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ABSTRACT 

Interest in increasing programmer productivity has spawned new software 
tools. Some of these tools are statistical packages, program generators, and 
database management systems (DBMS). In the area of DBMS, research is 
ongoing to improve the efficiency of DBMS tools. One research effort to improve 
the efficiency of DBMS is the multi-lingual database system (MLDS). MLDS 
combines software and hardware technology to gain efficiency and versatility in 
DBMS. The MLDS design goals overcome the conventional limitation to develop 
a database system that supports a single data model and a corresponding model- 
based data language. Examples of data models are relational, hierarchical, 
network, and entity-relationship. Examples of corresponding model-based data 
languages are SQL, DL/I, CODASYL, and Daplex. These models and their data 
languages are supported conventionally by separate DBMS. Instead, MLDS as a 
single DBMS is capable of supporting multiple models and their respective 
database languages. 

In this theses we present a methodology for supporting entity-relationship 
database management on an attribute-based database system, since the heart of 
MLDS is the attribute-based system. Specifically, we provide the design 
specifications for transforming Daplex requests into equivalent attribute-based 
data language requests. During this design process, we describe the data 
structures, control structures, and the functions required to implement this 


transformation. 
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eNO CT LON 


Interest in increasing programmer productivity has spawned new software 
tools [Ref. 1]. Some of these tools are statistical packages, program generators, 
and database management systems (DBMS). These tools, although designed to 
increase programmer productivity, strained hardware capabilities and spawned 
many new software products all designed to increase programmer productivity. 
In the area of DBMS, research is ongoing to improve the efficiency of DBMS 
tools. This research ranges from software solutions specially designed for 
particular hardware to specially designed hardware systems for data management 
called database machines. 

One research effort to improve the efficiency of DBMS is the multi-lingual 
database system (MLDS) [Ref. 2]. MLDS combines software and hardware 
technology to gain efficiency and versatility in DBMS. The MLDS design goals 
overcome the conventional limitation to develop a database system that supports 
a single data model and a corresponding model-based data language. Examples 
of data models are relational, hierarchical, network, and entity-relationship. 
Examples of corresponding model-based data languages are SQL. DL/I, 
CODASYL, and Daplex. These models and their data languages are supported 
conventionally by separate DBMS. MLDS as a single DBMS is capable of 
supporting multiple models and their respective database languages. Other 
design theses on the design and analysis of various model interfaces for a single 
DBMS are on the hierarchical [Ref. 3, 4], network (Ref. 5], and relational [Ref. 6} 
models. MLDS transforms traditional database models into a single database 
model called the attribute-based model [Ref. 7]. The attribute-based data model 
is the heart of MLDS. Implementation theses of various model interfaces for the 
multi-lingual database system are based on the following corresponding model- 
based data languages: SQL [Ref. 8], DL/I [Ref. 9], Codasyl-DML [Ref. 10], and 
Daplex [Ref. 11]). MLDS translates these data languages into a single data 
language called the attribute-based data language (ABDL). 


A. BACKGROUND 

Just as an operating system supports a wide variety of data structures and 
programming languages, MLDS supports a wide variety of database models and 
database languages. Similarly, as an operating system has many modes of access, 
such as interactive or batch processing, MLDS provides many modes of access, 
such as individual query or multiple transaction processing. The final analogy 
between operating systems and MLDS is that they are both invisible to the user. 
In other words, MLDS supports the user’s chosen data model and language 
without making the data conversion and language translation known to the user, 
i.e., a database user inputing SQL transactions will have the answer returned in 
relational form. This allows users the latitude of continuing to use conventional 
models and languages which they are familiar with, while developing expertise in 
MLDS which offers advantages over the systems they previously use. 


To date the database models and languages that can be transformed and 


translated are: 


(1) The hierarchical database model and IBM’s Data Language I (DL/I} 
(supported conventionally by IBM’s Information Management System, 


IMS); 


(2) The relational model and IBM’s Structured English Query Language 
(SQL) (supported conventionally by IBM’s SQL/Data System); 


(3) The network model and Univac’s CODASYL Data Manipulation 
Language (CODASYL-DML) (supported conventionally by Univac’s 
CODASYL-DML/ Data System); and, 


(4) The entity-relationship model and CCA’s Daplex Language (supported 
conventionally by CCA’s Daplex/Data System). 


This thesis concentrates on the design and analysis of an entity-relationship 
language interface for the multi-lingual database system, i.e., aforementioned 


item (4). 


lee ORGANIZATION OF THE THESIS 
Chapter 2 presents an overview of the MLDS which includes an overview 


of the attribute-based data model and data language. In Chapter 3 an overview 
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of Daplex is presented which includes an overview on the entity-relationship 
model, the basis of Daplex. Chapter 4 describes the data structures required by 
the language interface to translate Daplex transactions to ABDL transactions. 
Chapter 5 describes the transformation of the Daplex schema into the ABDL 
schema. Chapter 6 describes the Daplex transaction statements and control 
commands and the method in which they are translated into ABDL transactions. 
Chapter 7 describes the proposed specification for the expansion of the design of 
data structures into implementation of the data definition language (DDL) of the 
Daplex grammar using pseudo YACC. It also describes the expansion of the 
design of translating transactions into the data manipulation language (DML) of 
the Daplex grammar using pseudo YACC. Finally, Chapter 8 summarizes the 


conclusions from our research experience. 
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Il. THE MULTI-LINGUAL DATABASE SYSTEM (MLDS) 


MLDS consists of the modules as shown in Figure 2.1. The user data model 
(UDM) and the user data language (UDL) refer to the actual database model and 
language used by a user, e.g., entity-relationship model and Daplex language. 
The transformation and translation for each database model and its 
corresponding language is handled by the four modules: the language interface 
layer (LIL), the kernel mapping system (KMS), the kernel formatting system 
(KFS), and the kernel controller (KC). These modules together are referred to as 
the language interface. There is a set of these modules for each conventional 
database system replaced. There are SQL, DL/I, CODASYL-DML, and Daplex 
interfaces. The kernel database system (KDS) executes transactions from the 
kernel data language (KDL) for the kernel data model (KDM). 

The functions and operations of the language interface are as follows. LIL 
receives the chosen database definitions and transactions respectively in the UDM 
and the UDL form and sends them to KMS. KMS transforms the database 
definition from UDM to KDM or translates the user transactions from UDL to 
KDL. KMS then forwards the transformed database definition (in the KDM 
form) or the translated transaction (in the KDL form) to KC. KC sends the 
transformed database definition or translated transaction to KDS for processing. 
Upon completion of the specified operation, KC receives the results and sends 
them to KFS. KFS reformats the results into UDM format and sends the 
transformed results back to LIL. LIL completes the process by sending the 
results back to the user. 

KDM, KDL, and KDS correspond to the attribute-based model, the 
attribute-based data language, and the multi-backend database system, 
respectively. In the following three sections we expiore and examine the 
attribute-based model, the attribute-based data language, and the multi-backend 


database system. 
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UDM — : User Data Model 
IQ DYE : User Data Language 


JOY. : Language Interface Layer 
KMS : Kernel Mapping System 
ee : Kernel Controller 


ers : Kernel Formatting System 
KDM ~~: Kernel Data Mode] 

KDL : Kernel Data Language 
KDS : Kernel Database System 


Figure 2.1 The Multi-lingual Database System (MLDS) 


A. THE ATTRIBUTE-BASED DATA MODEL 

The data structures of the attribute-based data model include: database, 
file, record, attribute-value pair, keyword, attribute-value range, directory 
keyword, non-directory keyword, directory, record body, predicates, and query. 
A database consists of a collection of files. Each file contains a group of records 
characterized by a unique set of directory keywords. A record has two parts. 
The first part is a collection of attribute-value pairs or keywords. The attribute 
of an attribute-value pair defines a specific quality or certain characteristics of 
the value, e.g., in the example of file "Person" below, the "name" is an attribute. 
Each record can only have one value associated with a corresponding attribute in 
the attribute-value pair. Further, no ‘two attribute-value pairs have the same 
attribute in a record, i.e., all attributes are distinct in a record. Certain 
attribute-value pairs of a record are directory keywords since their attribute 


values or attribute-value ranges are kept in a directory, e.g.. <FILE, Person> 
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may be in the directory where "FILE" is the attribute and "Person" is the value 
of the attribute. The rest of the attribute-value pairs are non-directory keywords. 
The second part of the record is the record body which contains only textual 


information. An example of a record in the "Person" file is as follows: 


(<FILE, Person>, <NAME, Philip Goisman>, <SSN, 209327902>, 
{ Major }) 
Note that each attribute-value pair contains only one value; and all the attributes 
are distinct. The textual information of the record is contained within the curly 
braces, i.e., the record body. 
A keyword predicate of 3-tuples consists of an attribute, a relational operator, 
and an attribute value, e.g., (NAME = Philip Goisman). A query combines 


keyword predicates in disjunctive normal form. The following is a query. 


((FILE = Person) and (NAME = Philip Goisman)) or 
((FILE = Person) and (NAME = Mark Gross)} 


Without any confusion we note that we use parenthesis for both records and 


predicates. 


B. THE ATTRIBUTE-BASED DATA LANGUAGE 

The attribute-based data language (ABDL) [Ref. 2, 12] is defined in this 
section. ABDL supports the five primary database operations, INSERT, 
DELETE, UPDATE, RETRIEVE, and RETRIEVE-COMMON. A request in 
ABDL is a primary operation with a qualification. A qualzfication specifies the 
part of the database on which the request operates. Two or more requests 
grouped together form a transaction. In the remainder of the section each 
request will be discussed and illustrated with an example. 

The INSERT request inserts a new record into the database. The 
qualification of an INSERT request is a list of keywords and a record body. The 
following example illustrates the INSERT request. 

INSERT (<FILE, Person>, <NAME, Philip Goisman>, 


<SSN, 209327902>, <GRADE, 3.0> 
{student }) 
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This request inserts Philip Goisman, his social security number and a comment 
that he is a student into the Person file. 

A DELETE request removes one or more records from a database. The 
qualification of a DELETE request is a query. The following example illustrates 
the DELETE request. 


DELETE ((FILE=Person) and (NAME=Philip Goisman)) 


This request removes all those named Philip Goisman from the Person file. 

The Update request modifies records of the database. The qualification of an 
UPDATE request consists of two parts, the query and the modifier. The 
modifier specifies how the records being modified are to be updated. The 
following example illustrates the UPDATE request. 


UPDATE ((FILE=Person) and (NAME=Philip Goisman)) 
(GRADE = GRADE + 1.0) 


This request updates the grade one point higher for all persons named Philip 
Goisman. The query is ((FILE=Person and (NAME=Philip Goisman)) and the 
modifier is (GRADE = GRADE + 1.0) 

A RETRIEVE request retrieves records from the database. The qualification 
of a retrieve request consists of a query, a target-list, and an optional by-clause. 
The query specifies which records are to be retrieved. The target-list consists of a 
list of output attributes. It may also consist of an aggregate operation, i.e., AVG, 
COUNT, SUM, MIN, MAX, on one or more output attributes. The optional by- 
clause may be used to group records when an aggregate operation is specified. 
The following example illustrates the RETRIEVE request. | 

RETRIEVE ((FILE=Person) and (NAME=Philip Goisman)) 

(GRADE) by SSN 
This request lists the grades by social security numbers of all persons named 
Philip Goisman. The query is (FILE=Person) and (NAME=Philip Goisman). 
the target-list is GRADE, and the by-clause is by SSN. 

The last request, RETRIEVE-COMMON, is used to merge two files by 
common attribute-values. Logically, the RETRIEVE-COMMON request can be 


considered as two retrieve requests that are processed serially in the following 
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form. 


RETRIEVE (query-1)(target-list-1) 

COMMON (attribute-1,attribute-2) 

RETRIEVE (query-2) (target-list-2) 
The common attributes are attribute-1 (associated with the first retrieve request) 
and attribute-2 (associated with the second retrieve request). The following 


example of RETRIEVE-COMMON from a population census example will 


illustrate this request. 


RETRIEVE ((FILE=CanadaCensus) and 
(POPULATION >= 100000)) (CITY) 

COMMON (POPULATION, POPULATION) 

RETRIEVE ((FILE=USCensus) and 
(POPULATION >= 100000)) (CITY) 


This example finds all the records in the CanadaCensus file with population 
greater than 100,000, finds all the records in the USCensus file with population 
greater than 100,000, identifies records of respective files whose population figures 
are common, and returns the two city names whose cities have the same 
population figures. ABDL provides five seemingly simple database operations, 
which are nevertheless capable of supporting complex and comprehensive 


transactions. 


C. THE MULTI-BACKEND DATABASE SYSTEM (MDBS) 

MLDS allows users to work with their favorite data language without having 
the database system which supports the data language physically present on the 
same computer. However, a conventional database system which supports that 
data language requires the system to operate on the mainframe in competition for 
memory and peripheral resources with all other database systems and their 
applications. Although a conventional database system does not need to 
translate transactions or transform databases it nevertheless, competes with all 
other database systems and applications such as SQL/Data System and SQL 
transactions. Thus, the degradation of their performance is due to many 
database systems and applications that operate concurrently on the same 


computer. An obvious solution to improving performance of all the database 
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systems and their applications is to reduce the amount of resource sharing cost- 
effectively. This solution is to offload the database-system software from the 
mainframe computer to a separate, dedicated computer with its own disk system. 
This is called the software single-backend approach. 


Bell Laboratories has been first to adopt this idea in developing a database 


system known as XDMS. XDMS has the goal to: 


(1) obtain a cost saving and a performance gain through specialization of 
the database operations on a dedicated backend processor, 


(2) allow the use of shared databases [by different mainframe computers, 
now called hosts], 


(3) provided centralized [i.e., physical] protection of the databases, and 


(4) reduce the complexity when developing software for a stand-alone and 
new machine. 


It has been determined that single backends are cost-effective. However, they do 
not improve the performance of database systems entirely. Instead, the single 
backend can become the bottleneck with increased use of the database system. 
The problem of performance degradation due to heavy I/O usage still remained. 
Dr. Hsiao’s group developed an approach to this problem called the 
software multi-backend approach. The multi-backend database system (MDBS) 
uses a parallel architecture of backends controlled by a single backend controller. 
(See Figure 2.2.) Backends are individual database. processors, each with its disk 
system consisting of at least a disk controller and 1 or more disk drives. The 
backend controller supervises the execution of the database transactions and 
interfaces with the hosts and users through three levels of menus. Level 1 is the 
system level which allows various configurations of multiple backend database 
systems to be generated and initiated. Level 2 permits the main actions of the 
user and test interfaces. These actions are generating a database. loading a 
database, and executing the request interface module, which leads to level 3. 
Menus in level 3 allow the user to choose a new database. create a list of 
transactions, modify a list of transactions, select a list of previously defined 


transactions, or display the results from a list of transactions. 
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Figure 2.2 The Multi-Backend Database System (M{BDS) 


When generating or loading a database, the user data is received by the 
controller and evenly distributed via a communications bus to all of the backends 
in accordance with a round-robin-like algorithm. During the database loading 
process, the controller develops a directory in each backend to record the physical 
location of every record in the database, i.e., on the backend’s disk system. The 
directory of each backend is to be used to locate records for transaction 
processing. 

Each backend develops its own queue of transactions for processing and 
performs its own database operations. This process is enabled by replicating all 
the backend software and most of the directory information on each backend. 


All transactions are sent to each backend. Directories are identical except for the 
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individual addresses of records (record-ids). Hence, as a backend finishes 
processing a transaction it sends the transaction responses back to the controller 
by way of the communications bus. 

MBDS has been designed to achieve two performance goals. Goal one 
states that if we increase the number of backends, while the size of the database 
and the size of the responses to the transactions remain constant, we can produce 
a reciprocal decrease in the response times of the user transactions. Goal two 
states that if we increase the number of backends proportional to the increase in 
the size of responses to the transactions, we can produce invariant response times 
for the user transactions. 

The first goal demonstrates increased performance in comparison to a 
single-backend system. The second goal sets out to prove that maximum 
performance may be maintained even with heavier usage. These design goals also 
comply with the goal of MLDS, 1.e., MLDS must be high-performance with 
expandable hardware. The interested reader is referred to |Ref. 7, 13, 14) for a 


thorough and comprehensive discussion of the concepts and capabilities of 


MBDS. 
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II. DAPLEX 


Daplex is a database language that was created by David W. Shipman [Ref. 
15] in 1979 while working at the Computer Corporation of America (CCA) and 
the Massachusetts Institute of Technology (MIT) [Ref. 16]. Its foundation is in 
what Shipman and Gray [Ref. 16] call the functional data model. One of 
Shipman’s goals for Daplex is to provide a "conceptually natural" database 
interface language. That is, the Daplex constructs used to model real-world 
situations are intended to closely match the conceptual constructs a human being 
might employ when thinking about those situations. Such conceptual 
naturalness, to the extent it has been achieved, presumably simplifies the process 
of writing and understanding Daplex requests, since the translation between the 
user’s mental representation and its formal expression in Daplex is more direct. 

Gray notes that Shipman developed his concepts from the semantic net used 
in artificial intelligence. The semantic net is a structure that represents 
associations between objects. For each object of a given type, there is a 
corresponding collection of functions which are applicable to it; some of these 
provide simple values, but the results of others are found by following ‘arcs’ in 
the net, which connect the object to other objects of various types. Functions 
can be applied in turn to these objects, thus exploring a network of associations. 
Consequently, Shipman’s Daplex relies on functions or functional composition to 
derive actual values. 

However, Daplex, as implemented by CCA [Ref. 17, 18], rests more firmly on 
the entity-relationship model originated by Chen [Ref. 19] and further described 
by Ullman [Ref. 20] than on Shipman’s functional model. We are, therefore, 
focusing on the entity-relationship model. When application of functional 
concepts are used, we then note the application. 

Throughout this thesis, references to CCA’s sample database, sample queries, 
and graphical database representation will be used in analyzing theoretical 


concepts or developing design issues. CCA’s sample university database and the 
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graphical representation of the university database are presented in Appendix A. 
Other examples of the university database will be used appropriately in the 
chapters as they are needed. 

In the remainder of this chapter, the entity-relationship model is first to be 
presented to provide the basis of understanding Daplex. Then a Daplex overview 
is to be presented for the purpose of familiarizing the reader with the Daplex 
schema and transactions which are involved in the transformation and translation 


of Daplex into ABDL. 


A. THE ENTITY-RELATIONSHIP (E-R) DATA MODEL 
One of the goals of Chen’s entity-relationship (e-r) model is to present a 
logical view of data. This is an important issue for any DBMS. In 


conceptualizing any DBMS many questions arise. Two significant questions are 


[Ref. 20]: 


(1) What are appropriate data structures with which to implement a phy- 
sical database? 


(2) What are the properties of typical data and how should it be 
represented by physical structures? 


The e-r model attempts to answer these questions by adopting the more natural 
view that the real world consists of entities and relationships. The e-r model can 
achieve a high degree of data independence ( i.e., independent from 
implementation considerations) and is based on the set theory and the relation 
theory. According to Chen the reader may view the e-r model as a generalization 
or extension of existing models. 


In developing the e-r model Chen identified four levels of logical views of 


data which apply to the e-r model. They are: 


(1) Information concerning entities and relationships which exist in our 
minds. 


(2) Information structure--organization of information in which entities 
and relationships are represented by data. 


(3) | Access-path-independent data structure--the data structures which are 
not involved with search schemes, indexing schemes, etc. 


(4) Access-path-dependent data structure. 


At the first level we consider entities and relationships. Basically, an 
entity is a thing and is distinguishable. A specific person, company, or event is 
an example of an entity. Entities are classified into different entity sets such as 
the sets named Person, Employee, and Student. To belong in an entity set, an 
entity has the properties common to the other entities in the entity set. Among 
these properties is a test predicate associated with each entity set to test whether 
an entity belongs to it. For notational purposes let e represent an entity that 
exists in our minds and E, denote the ith entity set. 

A relationship is an association among entities. For example, husband- 
wife is a relationship between two person entities. Relationship is more formally 
defined in terms of its presence in a relationship set.: A relationship set, R,, is a 


mathematical relation among n entities each taken from an entity set: 


{(e; , e,, ...,e,) | e, in E,, eg in Eo, ...,e, in E, i 


and each tuple of entities, ( e;, e2, ..., e,), is a relationship. Intuitively, the role 
of an entity in a relationship is the function that the entity performs in the 
relationship. 

The information about an entity or a relationship is obtained by 
observation and measurement, and is expressed by a set of attribute-value pairs. 
Attributes are the common properties needed. to classify entities and relationships. 
An attribute can be formally defined as a function which maps from an entity set 


or a relationship set into a value set or a Cartesian product of value sets: 


Values are the qualities of attributes that distinguish entities. Examples of 
values are Philip, Goisman, 41, 209-32-7902. For the entity set "Person," the 
above values are values for the Person attributes: first-name, last-name, age, and 
social security number. Note that an attribute is defined as a function. Also, 


note that both entities and relationships have attributes. 
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The entities, relationships, and attributes at level 1 are conceptual objects 
in our minds. At level 2, we consider representatives of conceptual objects. To 
organize information associated with entities and relationships, Chen separates 
the information about entities from the information about relationships. One 
method of distinguishing amongst a class of entities or relationships is through 
the use of primary keys or keys. Primary keys uniquely identify an entity in an 
entity set through an attribute or set of attributes of that entity set such that the 
mapping from the entity set to the corresponding value(s) of the attribute(s) is 
one-to-one. If we cannot find such one-to-one mapping, we may define an 
artificial attribute and its value so that such mapping is possible. In Chapter 5 
we are going to see that this is precisely the solution we use to transform Daplex 
data to ABDL data. The artificial attribute we define is a unique key for each 
entity in its corresponding entity set. For example, in the entity set Person. we 
define the attribute "Person.key" which is a unique number for each entity. 

Since a relationship is identified by the involved entities, the primary key 
of a relationship can be represented by the primary keys of the involved entities. 
In certain cases, the entities in an entity set cannot be uniquely identified by 
values of their own attributes; thus we must use a relationship(s) to identify 
them. Theoretically, any kind of relationship may be used to identify entities. 
CCA’s graphical representation of their university database in Appendix A uses a 
one-to-many mapping in which the existence of many entities on one side of the 
relationship depends on the existence of one entity on the other side of the 
relationship. As Chen states, "This method of identification of entities by 
relationships with other entities can be applied recursively until the entities 
which can be identified by their own attribute values are reached." 

In using relationships to identify entities, Chen identifies two forms of 
entity relations. If relationships are used for identifying the entities. they are 
called weak entity relations. Ullman names weak entity relations built-in 
relationships or tsa relationships. They are characterized directly by the 
hierarchical structure amongst entity sets. For example, CCA's graphical 
representation in Appendix A denotes these as isa relationships. In this example 


the entity set "student" forms an isa relationship with the entity set "person." 
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Throughout this thesis we use the term built-in or isa to denote weak entity 
relations. If relationships are not used to identify entities, they are called regular 
entity relations. Theses relations usually represent themselves, i.e., provide labels 
or names for themselves. For example, in CCA’s graphical representation of their 
university database in Appendix A, the entity set "course" is a regular entity 
relation. Similarly, there are two forms of relationship relations. If all entities in 
the relationship are identified by their own attribute values, they are called 
regular relationship relations. For example, in CCA’s graphical representation of 
their university database in Appendix A, the entity set "person" is a regular 
relationship relation. If some entities in the relationship are identified by other 
relationships, they are called weak relationship relations. Ullman names weak 
relationship relations user-defined relationships. They are characterized by their 
roles between entities. For example, in CCA’s graphical representation of their 
university database in Appendix A, in the entities for entity sets "student" and 
"support staff," the attributes "enrollments" and "supervisor" specify user- 
defined relationships. Throughout this thesis we will use the term user-defined to 
denote weak relationship relations. When distinctions are required between 
identifications of the relations they will be noted. 

Level 3 and level 4 denote the e-r model using a distinct diagrammatic 
technique. This diagrammatic technique relies on rectangles, diamonds, and arcs 
developed by Chen. CCA's graphical representation in Appendix A drops this 
notation. However, it can be seen that if relationships were enclosed in diamonds 
in the CCA representation, that representation would take on the e-r diagram 


format. 


B. AN OVERVIEW OF DAPLEX 

Daplex is a database language used to access and manipulate data 
modeled in an entity-relationship like model. Database languages usually consist 
of two different parts, the data definition language (DDL), also, called the 
schema, and the data manipulation language (DML). This is also true of Daplex. 
This section introduces the primitive objects, the properties of the objects, and 
the relations among the objects in the DDL portion of Daplex. Also introduced 


are the primary operators of the DML portion of Daplex which is used to 
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manipulate the primitive objects in the DDL. The purpose of this overview is to 
introduce the terms and concepts of Daplex in order to understand the 
transformation and translation of Daplex into ABDL. (See Chapters 4 and 6.) 
Informal definitions of some of the above terms follow. They will be 
formally defined later in this section. The primitive objects are databases, entity 
types, and nonentity types. The properties of objects are the attributes and 
constraints. Intuitively, constraints specify legal values for the prirnitive objects. 
The relations among the objects are the generalization hierarchies. A hierarchy is 
a group of things that are in tree-like order from the base of the tree, called the 
root, to the leaves of the tree, called terminals. An individual thing in a tree is 
called a node. In between and inclusive of the root and the terminals may be 
nodes classified in relationship to their location in the tree as ancestors, 
descendants, types, subtypes, supertypes, parents or children. Informally, 
generalization hierarchies in Daplex reflect isa relationships where some entity 
types are subtypes of a more general entity type. For example, in the Person 
Generalization Hierarchy for the University Database Schema, the entity set 
"oraduate” isa entity set "student" where "student" is the more general entity 
type. For a formal definition on hierarchies see Weishar’s thesis [Ref. 3]. The 
primary operators which manipulate the primitive objects are the FOR EACH 
statement, the ASSIGNMENT statement, the CREATE statement. the 
INCLUDE statement, the EXCLUDE statement, the DESTROY statement, the 
MOVE statement, and the PROCEDURE CALL statement. 
1. | The DDL Portion Of Daplex 
Of Daplex, the data definition language (DDL) portion consists of a 
database schema of a related collection of the primitive objects, the properties of 
the objects, and the relations among the objects. An example of a database is 
the university database in Appendix A. The basic format of a database schema 


is as follows: 


DATABASE db name IS 
[ nonentity type declarations | 
entity type declarations 
| entity type constraints | 


END | db name ]; 


where: 
db name is the unique name of the database 


optional nonentity_ type declarations are string 
types, scalar types, and numeric constants 


entity type declarations are the declarations 
of entity types, their attributes, and the 
generalization hierarchies. 


optional entity type constraints are the 
properties of the declared entity types that 
must remain invariant under any operations 
on values of those types. 

Throughout this thesis square brackets ({,]) denote optional 
declarations. The database schema may be intermixed in any order. However, all 
types must be completely or partially declared before the name may appear in 
another declaration. 


Examples of entity types are also provided in Appendix A in the 


university database example. The basic formats of an entity type are as follows: 


(1) TYPE entity type name IS 
eeNeb iy 
[attribute name 1: attribute type; 
attribute name 2: attribute type; 


attribute name n: attribute type;] 
END ENTITY: 


(2) TYPE entity type name; 


where: 
entity type name is a unique name in 
the database 


attribute names are lists of one or 
more unique properties of an entity; 
if more than one name, then each 
attribute shares the same type 
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attribute types may be strings, 
scalars (integer, floating-point, 
or enumeration type), entities, 
nonentities or sets of any of 
the above types 

The second entity-type definition is a partial entity-type 
declaration. Partial declarations make entity names available as attribute types. 
An attribute type must be declared, completely or partially, before it can be 
referenced. A reference of entity types in this manner provides the operational 
link in implementing the user-defined relationships of the e-r model in Daplex. 
Whenever a partial declaration appears in an entity-type declaration. the 
complete declaration of that entity type must appear later in the sarne database 
declaration. 

The relations among the objects in Daplex are reflected in the 
generalization hierarchies. All types in the generalization hierarchy are entity 
types or entity subtypes. An example of the Person generalization hierarchy is 
presented in Appendix A. The example shows that the entity type, Person, 
forms the root of a tree of built-in relationships. The nodes of the tree are entity 
subtypes. The names of subtypes must be unique. Subtypes are descendants of 
the root type or other subtypes. As the hierarchy is traversed downward from 
level to level, each subtype inherits all of the attributes of its supertypes. 
Supertypes are ancestors of subtypes. The highest level ancestor is the root type. 
The general formats of a subtype are as follows. 

(1) SUBTYPE subtype name IS supertype_ names 
ENT Ty 


[attribute names 1: attribute type; 
attribute names 2: attribute _type; 


attribute names n: attribute type:| 


END EN PITY: 


(2) SUBTYPE subtype name; 


Supertype names is a list of one or more names of entity types and 
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subtypes of built-in relationships completely declared previously in an entity type 
declaration. Attribute names and types are the same as for entity types. The 
two subtype formats also correspond directly to the entity type formats. One 
important point to remember is that the complete declaration of subtype creates 
a built-in relationship. 

Nonentity types describe the primitive objects that declare data 
types other than entities. There are three nonentity types: the base type, the 


subtype and the derived type. Their formats are as follows. 
base type: TYPE type name IS type definition; 
subtype: SUBTYPE subtype name IS prev_name; 


derived type: TYPE name IS NEW prev_name 


The type definition declares the data types of the nonentities. They 
may be user-defined or predefined. The predefined types are STRING, — 
INTEGER, FLOAT, and BOOLEAN. User defined types are strings, 
scalars(integer, floating-point, enumeration, and boolean), and numeric constants. 

There are two types of constraints on entity types in Daplex. They 
are the overlap constraint and uniqueness constraint. An overlap constraint 
determines when an entity may legally belong to more than one terminal entity 
subtype. A terminal subtype is a leaf in the generalization hierarchy. Terminal 
types are disjoint unless they are connected with the overlap constraint. The 


general format of the overlap constraint is as follows. 


OVERLAP entity type names WITH entity type names; 


Entity type names are unique terminal subtypes. An example of the overlap 
constraint is in the university database example in Appendix A. 

The uniqueness constraint specifies, for a particular entity type or 
subtype, a collection of attributes whose values are unique for all entities in a 
database belonging to that type or subtype. Several examples are provided in 
the university database example in Appendix A. The general format of the 


uniqueness constraint is as follows. 
UNIQUE attribute name WITHIN entity type name; 
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A restriction for values of attributes is that uniqueness constraints only apply to 
values directly. Attributes that derive their values from relationships with 
entities or nonentities are precluded from forming uniqueness constraints. 

The Daplex DDL is to be discussed again in chapter 4 where 
transformations between Daplex and MLDS structures are presented. 

Ze The DML Portion Of Daplex 

The data manipulation language (DML) of Daplex consists of many 
statements. We discuss each of the statements in this section. The FOR EACH 
statement is used to retrieve data from the database. The ASSIGNMENT 
statement is used to modify attribute values in the database. The CREATE 
statement is used to insert new data into the database. The INCLUDE 
statement is to permit additions of attribute values to set valued functions 
(attributes). The EXCLUDE statement is used to remove an attribute value or 
group of attribute values from a set valued function. The DESTROY statement 
is used to remove data from the database. The MOVE statement moves a record 
or a group of records with the same type conditions from one set of subtype files 
into another set of subtype files. The PROCEDURE CALL statement allows the 
user to format output from transaction requests. The exact syntax and semantics 
of each of the primary operations is thoroughly examined in Chapter 6 when the 
translation process from Daplex to ABDL is explained. » 

An important point is to recognize that, in using the primitive 
operations, parenthetical functional representations of the primitive objects are 


used. Some examples are as follows. 


name(s) 

gpa(u) 
The attributes, name and gpa, are examples from the university database. The 
variables, s and u, apply to the entities, student and undergraduate, respectively. 
Notice that the value of the attribute, name, in the entity set Person 1s given to 
the entity set, Student, while the value of the attribute. gpa, in the entity set. 
Undergraduate, is self-contained. The functional composition is applied to values 
of attributes frorn user-defined relationships. The following example from the 


university database illustrates functional composition. 
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name(advisor(s)) 


As before, s represents the student entity. However, the attribute, advisor, in the 
entity set, Student, receives its value through its user-defined relationship with 
the faculty entity. The entity set, Faculty, in turn, inherits its value for the 
attribute, name, from its built-in relationship with its supertypes employee and 
person. The built-in relationships are illustrated in the person generalization 
hierarchy schema in Appendix A. The user-defined relationships are illustrated 


in the graphical representation of the schema, also in Appendix A. 
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io Ue oURUCT Unio NECKSOARY TO"EXECUTE DAPLEX 


To translate Daplex transactions to ABDL, Daplex data structures must also 
be transformed into ABDL data structures. Once Daplex transactions are 
received they may be translated into their corresponding ABDL equivalent 
constructs. The ABDL constructs have been presented in Chapter 2, Section A. 
Here we focus on the data structure transformation and transaction translation. 

In transforming data structures from Daplex to ABDL, the attribute-based 
(a-b) record structure has been created with an additional attribute-value (a-v) 
pair. Justification for this addition is presented in Chapter 5 on transforming the 
Daplex schema into a corresponding ABDL schema. 

The following discussion on data structures describes the methodology used 
to arrive at the final design of the data structures and the design and 


methodology of data structures for Daplex. 


A. A METHODOLOGY FOR THE DESIGN OF DATA STRUCTURES 

The method for designing Daplex data structures has been provided by 
intuition gained through the use of the declaration section from CCA’s sample 
university database schema and its graphical representation in their users 
manual. As presented in Chapter 3, a database is a collection of related data 
usually of several different types. The related data types, also presented in 
Chapter 3, are described in terms of relationships and entities. The database 
schema is the data definition of the types for relationships and entities. The 
physical representation of the schema in computer memory is via the data 
structure. The sample university database and the graphical representation are 
reproduced in Appendix A. 

The goal is to apply MacLennan’s abstraction principle [Ref. 21]. That is to 
identify and abstract useful, frequently recurring patterns of data into data 
structures that would represent the database schema. In representing the 


database it is necessary to provide entries in the Daplex data structures that 
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consider the primitive objects, the properties of the objects, and the relations 
among the objects. The data structure that applies most closely to the Daplex 
schema is the list. Thus, in the design of Daplex structures the list data 
structure has been applied across al] of Daplex’s primitive objects, properties of 
the objects, and relations among the objects in the Daplex schema. 

The university database has, also, helped to provide an example of an overall 
picture of how relationships between entities are structured within a Daplex 
database or an e-r model]. This example is depicted in Figure 4.1. This depiction 
presents the set-theoretic concepts that support Daplex and the e-r model. 


Daplex and the e-r model rely heavily on the set theory in their database design. 







Person 


Seiden c 


Figure 4.1 Set-Theoretic Depiction of Sample Daplex Database 


PeeobotGN OF DATA STRUCTURES FOR DAPLEX 

In the construction of the Daplex data structures, presented in Appendix B. 
it has finally been resolved, through trial-and-error, to provide a_ top-level 
database node that references or points to a list of all possible types, subtypes. 
and derived types of entities and nonentities. This top-level node is called the 
ent dbid node. Each type has its own list in the database. 

Entity types form lists of entity types, using the ent node structure, with 
individual sub-lists, using the function node structure, of their attributes and 
constraints. Subtypes, using the gen sub node structure, form lists of entity 
subtypes, sub-lists of their individual supertypes (entity types or subtypes), sub- 
lists of their individual overlap types, using either the overlap sub node or the 
overlap ent node structures, if applicable, and individual sub-lists of their 
attributes and constraints, again using the function node structure. Attributes of 
supertypes are inherited by subtvpes. The inheritance is provided by the list 
structure from subtype to supertype. 

Attributes may be completely specified or related to other entities or 
nonentities with single or multiple values of a single type. To allow for different 
attribute data types and the potential for multiple values, a sub-list structure is 
also created to point to the value or values of each attribute. This sub-list 
structure is the function node to be described below. To allow for one or more 
values of one attribute, the attribute value is provided by another sub-list using 
the ent_ value structure. 

Each nonentity creates separate lists of types, subtypes. and derived types. 
In one of the first data structure designs, the structures for subtypes and derived 
types have created sub-lists of their supertypes in order, back to the root. The 
sub-list to point back to the root has been eliminated due to the complexity 
involved in the implementation of circular pointers. Since the values of nonentity 
types are completely specified, only lists of values needed to be maintained. 
instead of the list of function nodes required by entity types. Thus a separate 
value node is created to list single or multiple values of these types. The 
ent value structure can be used directly from the nonentity types to store the 


value(s) of nonentity types. This contrasts to the indirect method of storing 
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entity values. As a by-product of designing the nonentity type structure, it is 
discovered that this structure also worked for the constant declaration. The 
constant declaration only requires storage for its name and value, both of which 
are provided in the ent _non_node structure. 

To handle completely specified values and relationship values in the entity 
types, a separate construct, called a function node, is developed. This construct 
provides for all the possible values, relationships, and constraints that a function 
could possibly have in an entity. Note, that only one structure in the 
function node is to be used to define an attribute value at any one time. 
Multiple values of attributes are also provided by using sub-lists of the ent value 
nodes or sub-lists of a specific relationship type. 


The final Daplex data structures are presented in Appendix B. 
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V. THE TRANSFORMATION OF DAPLEX SCHEMA INTO ABDL SCHEMA 


This chapter provides the specifications for transforming the Daplex Schema 
into its corresponding ABDL schema. In the first section of this chapter we 
present the methodology and background which led us to this specification. 


Finally, we present the specification. 


A. THE BACKGROUND AND METHODOLOGY 

The method we have used to derive our specification for the transformation 
of Daplex schema into ABDL schema is first to compare our knowledge about the 
Daplex and ABDL constructs. We then construct all the ways we fit or map the 
Daplex constructs into the ABDL constructs. The following discussion presents 
our findings. In section B we present our final specification. 

Given the a-v pairs in a record in ABDL, it is immediately obvious that the 
functions of Daplex correspond directly to the attributes of ABDL. However. 
there are difficulties. First, values in the a-v pair can only be single-valued 
whereas Daplex permits single- and multiple-valued attributes or sets of values. 
Second, ABDL has no direct way of representing relationships between individual 
records. 

The first problem can be resolved by repeating records for multiple-valued 
attributes. The number of records required is the Cartesian product of the 
number of multiple-valued attributes, if only individual records are involved, te.. 
relationships between records are not involved. (Storage of many records is not a 
problem since, in MBDS, each backend has sufficient storage.) 

In one mapping relationships are resolved by just repeating all the related 
attributes for each related record. The repeating of related attributes provides 
for the unique requirement of relationships in the e-r model, and is simple to 
implement. As noted, the storage capacity of MBDS is capable of handling 
repetition. 

The problem with this design becomes apparent when applying it to a real 


example. It is shown that records with multiple values multiply the number of 


39 


records. Records with relationships containing multiple values grow 
exponentially according to the depth they appear in the relationship tree in 
which their actual values appear. The mapping which provides the best solution 


to the two difficulties above is presented next in the specification. 


B. THE SPECIFICATION 

In this solution repeating records for multiple-valued attributes is retained. 
Relationships between records of different files are transformed using Chen’s 
solution of defining an artificial attribute and its value so that a unique mapping 
is possible. This has been described in Chapter 3. The artificial attribute we 
have defined is a unique key for each entity in its corresponding entity set. For 
example, in the entity set Person, we have defined the attribute "Person.key," 
which has as its value a unique number for each entity. Thus, a unique a-v pair 
or keyword, consisting of the entity key and its value, is provided for each entity 
type or subtype. The value of this unique keyword is the "link" between entities. 
types or subtypes, in a relationship, i.e., entities are related only in accordance 
with the unique key. To retrieve attribute values in an entity type or subtype, 
dependent on a relationship, we reference the corresponding unique keyword 
directly. 


As a consequent of this solution an algorithm to transform the Daplex 


schema into its corresponding ABDL schema is provided as follows: 


(1) Separate the entity types and subtypes from the nonentity types, sub- 
types, and derived types in the Daplex schema. 


(2) For each entity type or subtype create an ABDL file for the name of 
each entity type or subtype. In the first a-v pair the attribute is 
spelled "File" and the value is the name of the entity type or subtype 
from the declaration or data definition section of the database schema. 


(3) For each ABDL file that is an entity type, add the unique keyword as 
the second a-v pair of each record. The attribute consists of the file 
name followed by a dot and completed with the attribute, key. The 
value is a distinct number for each record in the file. 
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(3a) For each ABDL file that is an entity subtype add the unique keyword 
as the second a-v pair of each record. The attribute consists of file 
name followed by a dot followed with the attribute, key, followed by a 
left parenthesis, recursively followed by the related entity subtype 
file’s (if any) keyword attribute and left parenthesis, until the related 
entity type is reached. At this time the keyword for the entity type is 
added followed by the number of right parenthesis required to balance 
the number of left parenthesis in the entity subtype keyword. 


(4) For each attribute (function) in an entity, add an a-v pair to its 
corresponding record in its ABDL file. The attribute of the a-v pair is 
the name of the attribute (or function). The value of the a-v pair is, 
of course, its value. 


(4a) For multiple values or set valued functions, multiple records need be 
created to account for all the values of an attribute. This requires the 
duplication of all a-v pairs of that record and the insertion of the next 
value of a set valued function. Attributes with set valued functions 
are initialized as null to prevent unwanted record deletion for empty 
sets. 


(4b) For attribute values that derive their values from relationships with 
entities, the attribute of the a-v pair consists of the attribute name. 
followed by a left parenthesis, recursively followed by a related entity 
subtype file’s (if any) keyword attribute and left parenthesis, until the 
related entity type is reached. At this time the keyword for the entity 
type is added followed by the number of right parenthesis required to 
balance the number of left parenthesis in the entity subtype keyword. 


(4c) For single valued attributes, with no relationship dependencies. insert 
its value in the second position of the a-v pair. 


Applying the above algorithm to the sample Daplex University database 
schema in Appendix A, we transform the Daplex University database schema 
into its corresponding ABDL schema. This transforination is depicted by the 
templates of the ABDL University database schema in Figure 5.1. Asterisks in 
the templates represent values dependent upon relationships. Other values in the 
templates represent the entity or the attribute type. Nonentity types are treated 
as user-defined types. Ternplates are general representations of an actual or fully 
specified database schema. Templates represent the schema without showing the 


fully specified representation of records for set-valued functions. The translation 


of the CREATE statement in Chapter 6 shows an example between templates 


and a fully specified representation. 


(<File, person>, <person.key, **>, <name, string>, 
<ssn, string = 000000000> ) 


(<File, employee>, <employee.key, <person.key, **>>, 
<home_ address, string>, <office, string>, 

<phones, set of string>, <salary, float>, 

<dependents, integer range>) 


(<File, support staff>, 

<support staff.key, <employee.key, <person.key, **>>>, 
<supervisor, <employee.key, <person.key, **>>>, 

<full time, boolean>) 


(<File, faculty>, 

<faculty.key, <employee.key, <person.key, **>>>, 
<rank, rank name>, <teaching, <course.key, ***>>, 
<tenure, boolean = FALSE>, 

<dept, <department.key, ****>>) 


(<File, student>, 

<student.key, <person.key, **>>, 

<advisor,<faculty.key ,<employee.key,<person.key,**>>>>, 
<major, <department.key, ****>>, 

<enrollments, <set of enrollment.key, *****>>) 


(<File, graduate>, 

<graduate.key, <student.key, <person.key, **>>>, 
<advisory committee,<faculty.key, <employee.key, 
<person.key,**>>>>) 





(<File, undergraduate>, 
<undergraduate.key, <student.key. <person.key, **>>>., 
<gpa, grade point>, <year, integer range 1 .. 4 := ie) 


(<File, course>. 

<course.key, ***>, <title, string> 

<dept, <department.key, ****>, 

<semester, semester name>, <credits, integer>) 
(<File, department>, <department.key, ****>, 
<head,<faculty .key,<employee.key,< person.key,**>>>>, 
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(<File, enrollment>, <enrollment.key, *****> 


<class, <course.key, ***>>, 
<grade, grade point>) 


? 


Figure 5.1 Template Of ABDL University Database Schema 
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VI. TRANSLATING DAPLEX TRANSACTIONS TO ABDL TRANSACTIONS 


This section describes the translation of Daplex transactions to their 
equivalent ABDL transactions. The previous design theses, referenced in Chapter 
1, referred to the translation process as a mapping operation. Translation and 
mapping are synonomous in the context of these theses. The Daplex transactions 
to be described involve the primary operators for looping and updating. The 
Daplex looping operator is the FOR EACH loop. The Daplex updating operators 
are the ASSIGNMENT, INCLUDE, EXCLUDE, CREATE, DESTROY, MOVE, 
and PROCEDURE CALL statements. Each of these expressions and their 
associated ABDL mappings is discussed below. 

ABDL statements, as stated in Chapter 2, are RETRIEVE, RETRIEVE- 
COMMON, INSERT, UPDATE, and DELETE. The manipulation of these 
ABDL statements for the purpose of mapping the Daplex statements becomes 
apparent in the examples below. Once again examples are used to gain the 
necessary intuition in order to formulate a general mapping for each Daplex 
expression. The examples are taken from CCA’s Daplex User's Manual [Ref. 18]. 
The CREATE statement is selected first because it seems natural to start with 
the creation of a database; and, to provide clues about similarities between the 
ABDL INSERT and the Daplex CREATE. 

Neither Daplex’s PROCEDURE CALL statement nor ABDL’s RETRIEVE- 
COMMON _ statement will be presented in this chapter. The 
PROCEDURE CALL which includes procedures like print and cancel are 
accommodated by the MLDS and ABDL operators. The RETRIEVE-COMMON 


is not used in any of the Daplex examples. 
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A. THE CREATE STATEMENT 

The CREATE statement is used to insert new data into the database. More 
specifically, the CREATE statement creates a new database entity. The general 
case of CREATE (Figure 6.1) consists of one or more previously declared entity 
types or subtypes. Just as in imperative programming languages like Pascal, 
specific data structures must be declared in the declaration section defining the 
specific database. If multiple types are listed (entity types or subtypes), each 
type must be a terminal type. 

The Daplex example in Figure 6.2 shows the creation of graduate and faculty 
subtypes. Both are terminal types, i.e., leafs in the relationship tree or the 
Generalization Hierarchy of Person in Appendix A. Also, shown, to the left of 
their corresponding functions, are nonterminal supertypes (types and subtypes in 
the relationship tree, e.g., person, employee, and student) of graduate and 
faculty. All functions receive values. They may be user-defined or default values. 
The function tenure in the entity faculty receives a default value. Functions with 
default values do not have to be specified. The system automatically assigns 
their values by looking them up in their declarations. It is possible to have all 
default values and thus no function names. Ancestors, such as these, require 
indirect creation whenever their descendents, (i.e., faculty and graduate), are 
created. Faculty and graduate are declared as overlapping entities (see Chapter 
3). Thus, they have the same attribute characteristics of person. 


The equivalent ABDL example in Figures 6.3a and 6.3b is a sequence of 


CREATE NEW entity type names 
(function name 1 => expression 1. 
function name 2 => expression 2. 


function name _n => expression _n)|: 


Figure 6.1 The CREATE in Daplex 


Ayl 


CREATE NEW graduate, faculty 


-- (name => "Jane Jones", 
person |__ ssn BBO Ue 

-- home address => "503 S. Atherton", 

| office => "104", 
employ | phones => {"4928860", "5327020"}, 

| salary => 5000.00, 

| dependents => 0, 

-- rank => assistant, 

| teaching => 

| {c IN course WHERE title(c) = {"CS2970","CS3111"}}, 
fac’lt | (tenure - default ) 

| dept => 

| _{d IN department WHERE dname(d) = "CS"}, 

-- advisor => 

| {f IN faculty WHERE name(f) = "MacLennan"}, 
stud't | major => 

| {d IN dept WHERE dname(d) = "CS"}, 

| enrollments => 


| {e IN enrollment WHERE title(e) = 
—_{"CS$3450", "CS4112", "CS4150"}}, 
grad |advisory committee => 
| {f IN faculty WHERE name(f) = {"Hsiao", "Davis"}}); 


Figure 6.2 A Daplex Sample for the CREATE 


non-hierarchical, fixed-length record types. These record types may be entity 
types or subtypes in accordance with the daplex schema. The thirteen 
statements in Figure 6.3a show the ABDL templates that correspond to the 
Daplex CREATE. These templates are the actual ABDL instructions to 
implement the CREATE. The disjunction in the ABDL RETRIEVE operation 
repeats the INSERT instruction until there are no more lookup functions to be 
retrieved. The nineteen statements in Figure 6.3b show the actual effect of the 
ABDL templates on the database. For example, it is shown that three separate 
records are created for the student entity and the enrollment entity for each 
course in which the student enrolled. Later statement mappings do not reflect 


the effect of ABDL templates. However, the reader should be cognizant of the 
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effect. Spaces are left between entities to reflect the corresponding mappings for 
each created entity. 

With the intuition gained from the above example, the general case of 
translating the CREATE statement in a sequence of ABDL operations can be 
formulated. As shown in Figure 6.4, the translation consists of a sequence of 
INSERT and RETRIEVE operations. The sequence is dependent upon the 
actual expression value, i.e., the value on the right hand side of the imply sign ( 
=> } in Figure 6.1. If the expression value is a single actual value, then only one 
INSERT is generated. If the expression value is a set of actual values. then 
multiple INSERTs are generated for multiple values. This is reflected by the 
function phones in the entity employee. Otherwise, expression obtains its value 
from relationships between entities. There are several instances of this in the 
example. One is reflected in the function enrollments in the entity student. In 
that case, a RETRIEVE, in disjunctive normal form, is used to return a key to 
the value of the expression. This is shown in the example in Figure 6.3b. 

Before beginning a CREATE the relationship tree must be traversed through 
all its ancestors to its ultimate ancestor, the root. Each ancestor, in addition to 
the file specified in the CREATE, must be created with the appropriate function 
values as described in the example. The ancestors and the specified file(s) then 
determine the number of INSERTs required for that CREATE transaction. The 
order that the functions occur in the CREATE transaction follows the prefix 
structure of their corresponding entity(s) in the relationship tree up until the 
terminal entity(s) specified. 

The general case of the corresponding ABDL CREATE in Figure C.1 of 
Appendix C is a mapping algorithm, not an execution algorithm. A mapping 
algorithm is a method for specifying all possible translations from a Daplex 
transaction to its equivalent transaction in ABDL. An execution algorithm is a 
method for implementing the specifications produced by the mapping algorithm 
so that the equivalent ABDL transaction may be run. Figures 6.3a and 6.3b are 
examples of an execution algorithm. 

After the file(s) or entity(s) have been located in the Daplex CREATE 


statement, the mapping algorithm for the CREATE evaluates the expressions for 
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1. INSERT(<File, person>, 
<person.key, next file seq numb= *>, 
<name, Jane Jones>, <ssn, 111223333>) 


/* Following two INSERTS on employee because phones 
are completely specified */ 

2. INSERT(<File, employee>, <employee.key,<person.key, *>>, 
<home_ address, 503 5. Atherton>, 
<office, 104>, <phones, 4928860>, 
<salary, 5000.0>, 
< dependents, 0>) 

3. INSERT(< File, employee>, <employee.key,< person.key, *>>, 
<home address, 503 S. Atherton>, 
<office, 104>, <phones, 5327020>, 
<salary, 5000.0>, 
< dependents, 0>) 


4. RETRIEVE(<File=course> and <title=CS2970> or 
<File=course> and <title=CS3111>) (course.key) 
5. RETRIEVE (<File=department> and <dname=CS>} 
(department.key) | 
6. INSERT(<File, faculty>, 
<faculty.key,<employee.key.<person.key, *>>>, 
<rank, assistant>, 
<teaching,<course.key, **>>. 
<dept,<department.key,***>>) 


=) 


. RETRIEVE(<File=person> and <name=MacLennan>)(person.key) 
8. RETRIEVE(<File=department> and <dname=CS>) 
(department.key) 

9. RETRIEVE(<File=course> and <title=CS3450> or 
<File=course> and <title=CS4112> or 
<File=course> and <title=CS4150>) (course.key) 

10. INSERT(< File.enrollment>,<enrollment.key,****>, 

<class,<course.key.** >>.<grade,0.0>) 

11. INSERT(<Filestudent>, <student.key,<person.key.**>>, 

<advisor,<faculty.key, 

<employee.key.<person.key,**>>>>, 
<major,<department.key,***>>, 
<enrollments,<enrollment.key,**>>) 


12. RETRIEVE(<File=person> and <name=Hsiao> or 
<File=person> and <name=Davis>) (person.key) 
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13. INSERT (<File,graduate>, 
<graduate.key ,<student.key,<person.key,”* >>>, 
<advisory committee,<faculty.key, 
<employee.key,<person.key,**>>>>) 


Figure 6.3a An Equivalent Example of CREATE in ABDL Templates 


The effect of requests executed in the templates will 


look like: 


1. INSERT(<File, person>, 
<person.key, next file seq numb= *>, 
<name, Jane Jones>, <ssn, 111223333>) 


2. INSERT(<File, employee>, <employee.key,< person.key, *>>. 
<home_ address, 503 S. Atherton>, 
<office, 104>, <phones, 4928860>, 
<salary, 5000.0>, 
< dependents, 0>) 
3. INSERT(< File, employee>, <employee.key,< person.key, *>>. 
<home address, 503 S. Atherton>, 
<office, 104>, <phones, 5327020>, 
<salary, 5000.0>, 
< dependents, 0>) 


4. RETRIEVE(<File=course> and <title=CS2970> or 

<File=course> and <title=CS3111>) (course.key) 

5. RETRIEVE (<File=department> and <dname=CS>) 

(department.key) 

6. INSERT(<File, faculty>, ; 
<faculty.key ,<employee.key,<person.key, *>>>, 
<rank, assistant>, 
<teaching, course.key -> CS2970>. 
<tenure, false>, 
<dept, department.key -> CS>) 

7. INSERT(< File, faculty>, 
<faculty.key,<employee.key,<person.key, *>>>, 
<rank, assistant>, 
<teaching, course.key -> CS3111>, 
<tenure, false>, 
<dept, department.key -> CS>) 


8. RETRIEVE(<File=person> and <name=MacLennan>)(person.key) 
9. RETRIEVE(<File=department> and <dname=CS>) 
(department.key) 
10. RETRIEVE(<File=course> and <title=CS3450> or 
<File=course> and <title=CS$4112> or 
<File=course> and <title=CS4150>) (course.key) 
11. INSERT(<File,enrollment>,<enrollment.key,****>, 
<class,<course.key,** >>.<grade.0.0>) 
12. INSERT(<File,student>, <student.key,<person.key,**>>, 
<advisor,<faculty.key, 
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<employee.key,< person. key,**>>>>., 
<major,<department.key,***>>, 
<enrollments,<enrollment.key,**>>) 
13. INSERT(< File,enrollment>,<enrollment.key,****>, 
<class,<course.key,** >> ,<grade,0.0> ) 
14. INSERT(<File,student>, <student.key,<person.key,**>>, 
<advisor,<faculty.key, 
<employce.key,< person.key,**>>>>, 
<major,<department.key,***>>, 
<enrollments,<enrollment.key,** >>) 
15. INSERT(<File,enrollment>,<enrollment.key,****>, 
<class,<course.key,**>>,<grade,0.0>) 
16. INSERT (<File,student>, <student.key,<person.key,**>>, 
<advisor,<faculty.key, 
<employee.key,< person.key,**>>>>, 
<major,<department.key,***>>, 
<enrollments,<enrollment.key,**>>) 


17. RETRIEVE(<File=person> and <name=Hsiao> or 
<File=person> and <name=Davis>) (person.key) 

18. INSERT (<File,graduate>, 
<graduate.key,<student.key,<person.key,**>>>. 
<advisory committee,<faculty.key, 

<employee.key,< person.key,**>>>>) 

19. INSERT (<File.graduate>, 
<graduate.key,<student.key,<person.key,**>>>. 
<advisory committce,<faculty.key, 

<employee.key,<person.key,**>>>>) 


Note: Multiple function values, fully specified (actual) or 
‘ referenced, create multiple records. 


Figure 6.3b An Equivalent Example of the CREATE in ABDL 


each function of an entity. If the function expression requires a reference to a 
value in a relationship with another entity, the algorithm retrieves the entity(s) 
and returns the reference (i.e., pointer) to its value(s). 

Sometimes it is necessary to create an entity just to be able to return a 
reference to a value. Such a case occurs with the entity type enrollment. Before 
the referenced value for attribute enrollments can be entered in the entity 


student, entity enrollment must be created for each course in which that student 


AT 


is to enroll. Finally, regardless of whether the functions of an entity receive actual 
or referenced values from their expressions, the entity is created. When the 
specified file(s) or entity(s) have been created the CREATE statement is 
completely mapped. 

Note that this mapping algorithm only specifies one error condition. There 
are three other types of errors possible: undeclared finenone in an entity, illegal 
function expression types, and illegal overlapping of entities. These errors are 
caught by the DML routine in Chapter 7 that checks for data-structure 
correctness. The point here is that these three types of errors are implementation 


errors that do not relate to the mapping process. 


B. THE DESTROY STATE ten? 

The DESTROY statement is used to remove data from the database. It 
reverses the effect of the CREATE statement. The corresponding ABDL 
statements are RETRIEVE and DELETE. The statement of DESTROY is as 


follows: 


DESTROY (entity valued expression); 


An entity valued expression is an entity-valued loop parameter, a single, entity- 
valued function expression, or an entity-valued set containing one member. The 
entity (record) is deleted from all types and subtypes to which it belongs and is 
completely removed from the database. 

An example is provided in Figure 6.4. The mapping algorithm is displayed in 
Figure C.2 of Appendix C. 


DESTROY ({f in faculty WHERE ssn(f) = "111223333"}: 
The following is the execution algorithm: 


Search entity types and subtypes for 
function expressions for 
faculty. key 
Return file = student, file = graduate, 
file = department: 
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RETRIEVE (<File = person> and <ssn = "111223333">) 
(person.key) 
for each (person.key) 
if not (RETRIEVE (<File = student> and <advisor = *>) 
(student.key)) or 
not (RETRIEVE (<File = graduate> and <advisor = *>) 
(graduate.key)) or 
not (RETRIEVE (<File = department> and <head = *>) 
(department.key)) 
then 
DELETE (<File = faculty> and 
<faculty.key<employee.key<person.key = *>>>) 
DELETE (<File = employee> and 
<employee.key<person.key = *>>) 
DELETE (<File = person> and <person.key = *>) 


Figure 6.4 Example of DESTROY Statement 


Ge THE FOR EACH LOOP 

The FOR EACH statement is a loop to retrieve data from the database. The 
FOR EACH: loop is shown in Figure 6.5. The FOR EACH loop performs 
instructions for each member in a set of database values given in set expression. 
The loop label functions as a "goto" instruction to exit from the loop body when 
certain conditions are satisfied. 

The loop parameter is a variable for an attribute of the set of database 
values. It represents the same attribute for the life of the loop body. The 
set expression provides the set of actual values for the loop body. 

In the remainder of this section examples from the Daplex manual are used 


to demonstrate Daplex to ABDL translations (Figure 6.6). For all of the 


[loop label:} FOR [EACH] loop parameter IN set_ expression 
[WHERE boolean expression] [BY order clause| 
[LOOP] 
loop_body 
ENE LOO P|; 


Figure 6.5 The FOR EACH Loop 
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following examples, the RETRIEVE in the data manipulation language of ABDL 
provides the same function that the Daplex PRINT command does. The 
RETRIEVE executes in the following steps. First, the KC of MLDS gets a buffer 
for a particular Daplex function or attribute, such as person.key. It then takes 
the value of person.key from that buffer, which is actually a temporary file, and 
uses it to get the name from the person file. Finally, it prints the name. The 


function in the target list of the RETRIEVE, e.g., name, is the value printed. 


Example C.1 
FOR EACH s IN student 
LOOP 
PRINT (name(s)); 
END LOOP, 


The following is the execution algorithm: 
RETRIEVE (File = student) (person.key) 


for each person.key 
RETRIEVE ((File = person) and (person.key = ***)) 


(name) 
Example C.2 
FOR EACH s IN student BY ASCENDING name(s) 
LOOP 


PRINT (name(s)): 
PRINT (name(advisor(s)); 
END LOOP; 
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The following is the execution algorithm: 


RETRIEVE (File = student)(person.key) 

for each person.key 

RETRIEVE ((File = person) and (person.key = **)) 

(name,person.key) BY name 

for each person.key 

RETRIEVE ((File = student) and (person.key = **)) 
(advisor) 

for each person.key 

RETRIEVE ((File = person) and (person.key = **)) 

(name) 


Example C.3 
FOR EACH e IN enrollments(student) 
WHERE credits (class(e)) > 3 
BY semester (class(e)) 
LOOP 
PRINT (title(class(e))); 
END LOOP; 


The following is the execution algorithm: 


RETRIEVE (File = student) (enrollments) 
/* enrollments -> enrollment.key */ 
for each enrolliment.key 
RETRIEVE ((File = enrollment) and 
(enrollment.key = **))(class) 
for each course.key 
RETRIEVE ((File = course) and 
(course.key = **) and (credits > 3)) 
(title) BY semester 


Example C.4 


FOR EACH u IN undergraduate 
WHERE COUNT (enrollments(u)) > 4 AND 
(gpa (u) < 2.5 OR FOR SOME e in 
enrollments(u) : (grade (e) < 1.5)) 
LOOP 
PRINT (name(u)); 
END LOOP; 


The following is the execution algorithm: 
/* Note: cannot include aggregate operators 
/* as one of RETRIEVE arguments. May only 
/* include in target list. */ 


RETRIEVE (File = undergraduate) (person.key) 
for each person.key 
RETRIEVE ((File = student) and (person.key = **)) 
(enrollments, COUN T(enrollments).person.key ) 
if COUNT > 4 
for each person.key 
RETRIEVE ((File = undergraduate) and (gpa < 2.5)) 
(person.key) BY (person.key) 
/* BY sorts by person.key and 
/* eliminates duplicates */ 


for each enrollment.key 
RETRIEVE ((File = enrollments) 
and (enrollment.key = **) 
and ( grade < 1.5))(enrollment.key) 
for each enrollment.key 
RETRIEVE ((File=student) and (enrollment.key=**)) 
(person.key) BY (person.key) 
/* Check and discard duplicates from student 
/* retrieve. 
/* Then compare both person buffers and eliminate 
/* duplicates by merging into one new person file. 
/* Finally retrieve names from person file. */ 


RETRIEVE ((File = person) and (person.key = **)) 


(name) 


/* For the general case will have to specify 

/* two or more pointers to buffers of the same 
/* named target list. Number of pointers will 
/* depend upon number of "OR’s". 


Example C.5 
FOR EACH s IN student BY name(s) 
LOOP 
PRINT (name(s)): 
FOR EACH e IN enrollments(s) 
WHERE credits (class(e)) > 3 
/* where’s implemented as arguments except after 
[a TOR so’. | 
BY semester (class(e)) 
DESCENDING credits(class(e)) 
LOOP 
PRINT (title(class(e))); 
END LOOP; 
Puy DOO. 


The following is the execution algorithm: 


RETRIEVE (File = student) (person.key) 
for each person.key 
RETRIEVE ((File = person) and (person.key = *)) 
(name.person.key) BY name 
for each person.key 
RETRIEVE ((File = student) and (person.key = *)) 
(enrollments) 
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for each enrollment.key 
RETRIEVE ((File = enrollments) and 
(enrollment.key = **)) (class) 
for each course.key 
RETRIEVE ((File = course) and (course.key = *) 
and (credits > 3)) 
(title) BY (semester, INVERT (credits)) 


Example C.6 
FOR EACH c IN course 
BOOP 
FOR EACH d IN department 
WHERE name(d) = name(dept(c)) and 
head(d) != NULL 
LOOP 
PRINT (name(head(d))); 
PRINT (title(c)); 
ERP ELOOR, 
END LOOP; 


The following is the execution algorithm: 


RETRIEVE (File = course) (course.key dept) 
for each course.key 
for each department.key 
RETRIEVE ((File = dept) and (department.key = **) 
and (head != 0)) 
(department.key.head) 
for each person.key 
RETRIEVE ((File = person) and (person.key = **)) 
(name) 
for each department.key 
RETRIEVE ({File = course) and (dept = department.key) 
(title) 


Example C.7 
FOR EACH s IN student WHERE advisor(s) != NULL 
LOOr 
FOR EACH ¢ IN enrollments(s) 
LOOP 
PR name(s): 
Picea (title(class(c))); 
DIOL 
FOR EACH t IN teaching (advisor(s)) 
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LOOP 
PRINT (name(advisor(s)), title(t)); 
END LOOP: 
END LOOP; 


The following is the execution algorithm: 


RETRIEVE ((File = student) and (advisor != 0)) 
(person. key, advisor) 
for each person.key 
RETRIEVE ((File = person) and (person.key = **)) 
(name) 
RETRIEVE ((File = student) and (person.key = **)) 
(enrollments) 
RETRIEVE ((File = enrollments) and 
(enrollment.key = **)) (class) 
for each course.key 
RETRIEVE ((File = course) and (course.key = **))(title) 
RETRIEVE ((File = person) and (advisor = **))(name) 
RETRIEVE ((File = faculty) and 
(advisor(person.key) = **)) (teaching) 
for each course.key 


RETRIEVE ((File = course) and (course.key = **)) (title) 


Figure 6.6 Examples Of FOR EACH Loop And Their ABDL 


Translations 


A mapping algorithm is presented in Figure C.3 of Appendix C for the FOR 
EACH loop. For any loop parameter, such as in Example C.3 where e represents 
the enrollments, the algorithm first retrieves a set of all possible database values. 
Then, on the basis of valnce received and the function declared (in the loop body 
or in the set expression), the algorithm proceeds to translate the instructions in 
the loop body. The loop body instructions are represented recursively inside the 
square brackets in the mapping algorithm. 

ABDL automatically returns values in ascending order. The order clause 
DESCENDING in Example C.5 is not necessary in the mapping algorithm. The 
KC in the language interface will translate DESCENDING by returning values in 


reverse order. The "BY" instruction in ABDL is the same as in Daplex. 
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De THE ASSIGNMENT STATEMENT 

The ASSIGNMENT statement is used to modify function values in the 
database. The ASSIGNMENT statement is very close to the LOOP examples. 
In fact, except for using the assignment staternent in the body of the LOOP. 
their forms are identical. The assignment statement provides a capability to 
assign or change a value of a _ single valued function expression. A 
single valued function expression is an attribute (function) with only one distinct 
value, i.e., it is not a set-valued function. For example, if an employee's salary is 
increased, ASSIGN (:=) will permit the change. Similarly, if a student’s gpa 
needs to be changed, ASSIGN makes the change. 

The format for an ASSIGN is as follows: 


[loop label:| FOR [EACH] loop parameter 
IN set expression 
[WHERE boolean expression] [BY order clauses| 
LOOP 


single valued function expression := expression; 
END LOOP; 


Examples with comments are provided in Figure 6.7. The mapping 


algorithm is shown in Figure C.4 of Appendix C. 
Example D.1 


FOR EACH s IN student 
WHERE name(s) = "Thomas Jefferson" 


LOOP 
gpa(s) := 3.7; 
END LOOP: 


The following is the execution algorithm: 
/* example of descendant */ 


RETRIEVE ((File = person) and 
(name = "Thomas Jefferson")) (person.key) 
for each (person.key) 
/* find function(s) in appropriate file */ 
UPDATE ((File = undergraduate) and 


VO 


(undergraduate.key(student.key(person.key = *)))) 
(gpa = 3.7) 


Example D.2 


FOR EACH u IN undergraduate WHERE name(s) = "cow" 
LOOP 
advisor(u) := 
{f IN faculty WHERE name(f) = "Mary Jones"}; 
END LOOP: 


The following is the execution algorithm: 
/* example of ancestor */ 


RETRIEVE ((FILE = person) and (name = "cow")) 
(person.key) 
for each (person.key = *) 
RETRIEVE (File = undergraduate) and 
(undergraduate.key(student.key(person.key = *)))) 
(undergraduate.key(student.key (person.key))) 
/* find faculty member */ 
RETRIEVE ((File = person) and (name = "Mary Jones")) 
(person.key) 
for each (person.key = **))) 
RETRIEVE ((File = faculty) and 
(faculty.key(employee.key(person.key = **)))) 
(faculty.key(employee.key (person.key))) 
/* find update function(s) in appropriate file */ 
UPDATE ((File = student) and 
(student.key(person.key = *)) (advisor = **)) 


Example D.3 


/* if following were in above loop error would occur 
since function was not declared with null */ 


major(u) := NULL; 


Example D.4 
gpa(u) := avg (grade DUPLICATES (enrollments(u))); 


find enrollments in file = student 


go to file = enrollment and for all grades for which 
u is enrolled get the average using aggregate 
operation. 


Example D.5 


FOR EACH d IN department WHERE dname(d) = "CS" 
LOOP 
head(d) := 
{u IN undergraduate WHERE name(u) = "moose"}; 
END LOOP; 


This assignment will cause an error, since the head 
function is declared as a faculty type and the 
undergraduate type does not 

overlap with the faculty type. 


Figure 6.7 Examples of The ASSIGNMENT Statement 


E. THE INCLUDE STATEMENT 

The INCLUDE statement is used to add attribute values to set-valued 
functions (attributes). As shown in the explanatory examples, each of the 
INCLUDE statements requires the FOR EACH loop to first find the file. The 
corresponding ABDL statements are a sequence of RETRIEVE operations (for 
the FOR EACH) followed by one or more INSERT operations (for the 
PENCIL DE). 

The format of the INCLUDE statement is: 


[loop label:| FOR [EACH] loop parameter 
IN set expression 
[WHERE boolean _ expression] [BY order clauses] 
LOOP 
INCLUDE expression 
INTO set valued function expression; 
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END LOOP; 
An expression 1s any string, scalar or entity-valued expression that yields a single 
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value or a set of values. A set valued function expression is any function 
expression where the function name has been declared to be set valued and the 
function argument is a single valued entity expression, an 
entity valued loop parameter or a_ single entity valued function expression. 
The type of the expression values must be the same as the type of the function 
name. An error also occurs if any value falls outside of the declared legal range 
of values for the function. 

Examples with comments are provided in Figure 6.8. The general mapping 


algorithm is shown in Figure C.5 of Appendix C. 


Example E.1 


FOR EACH s IN student WHERE name(s) = "Thomas Jefferson" 
LOOP 
INCLUDE {c IN course WHERE title(c) = "History 1"} 
INTO enrollments(s); 

END LOOP; 

The expression { c IN course WHERE title(c) = "History 1"} is an example 
of an entity-valued expression that defines a set of course entities that can be 
treated as a unit. All course entities titled "History 1" would be included in the 
enrollments set for each student named Thomas Jefferson. EXCLUDE is the 
corresponding statement used to remove values from a set. 

Enrollments is an example of a set valued function expression. So if s is a 
loop parameter which ranges over student, this adds the course "History 1" to 
the set of enrollments of s. Or s could represent one student only depending upon 
the conditions in WHERE of FOR EACH. In the example above , if "Thomas 
Jefferson" is a unique name, then only s for "Thomas Jefferson" would have 


"History 1" added to his set of enrollments in his student record. 


The following is the execution algorithm: 


RETRIEVE (<File = person> and 
<name = "Thomas Jefferson">) (person.key) 
RETRIEVE (<File = course> and <title = "History 1">) 
(course.key) 
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for each (person.key) 
/* check if student - if not error determined by KC*/ 
RETRIEVE (<File =student> and 
<student.key<person.key = *>>) (all) 
INSERT (<File,enrollment>, 
<enrollment.key,**>,<class,<course.key,** >>, 
<grade,0.0>) 
/* new enrollment.key only */ 
INSERT (<File,student>, <student.key<person.key,*>>, 
adviser, ** S.aqmajon a>. 
<enrollments,<enrollment.key ,**>>) 


Example E.2 


FOR EACH f IN faculty WHERE name(f) = "Cow" 
LOOT 
INCLUDE {c IN course WHERE title(c) = "DBMS" 
AND dname(dept(c)) = "EECS"} 
INTO teaching(f); 
END LOOP; 


The following is the execution algorithm: 


RETRIEVE (<File = person> and <name = "Cow">) 
(person.key) : 

RETRIEVE (<File = department> and <dname = "EECS">) 

| (department.key) 

RETRIEVE (<File = course> and <title = "DBMS"> and 

<dept = **>) (course.key) 
for each (person.key) 

RETRIEVE (<File = faculty> and 
<faculty.key <employee.key<person.key = *>>>) 

(all) 
/* new teaching function only */ 

INSERT (<File,faculty>, 
<faculty.key,<employee.key,<person.key.* >>>, 
<rank name.****>,<teaching.<course.key,***>>, 
Saienure, Se depuy =e >) 


Example E.3 
FOR EACH e IN employee WHERE name(e) = "Bug" 


LOOP 
INCLUDE "4848151" 


INTO phones(e) 
END LOOP 


The following is the execution algorithm: 


RETRIEVE (<File = person> and <name = "Bug">) 
(person.key) 
for each (person.key) 
RETRIEVE (<File = employee> and 
<employee.key<person.key = *>>) (all) 
/* new phones function only */ 

INSERT (<File,employee>,<employee.key <person.key,*>>, 
<hOmema@ dress \e eee Oice, hee, 
<phones,"4848151">, <salary, AXXX>, 
<dependents,XXXX>) 


Figure 6.8 Examples of INCLUDE Statement 


F. THE EXCLUDE STATEMENT 

The EXCLUDE statement is the opposite of the INCLUDE statement. Its 
purpose is to remove a value or group of values from a set valued function. As 
shown in the explanatory examples, each of the EXCLUDE statements also 
requires the FOR EACH loop to first find the file. The corresponding ABDL 
statements are a sequence of RETRIEVE operations (for the FOR EACH) 
followed by a DELETE operation (for the EXCLUDE). The DELETE statement 
has the capability to undo the ABDL INSERT statement. completely or 
partially. Also, the DELETE may not remove anything from the database if 
expression, in the general format of EXCLUDE below, specifies something not in 
the database, i.e., an incorrect phone number to be excluded. The INSERT 
statement has been discussed in its use in translating the Daplex CREATE 
statement and the DELETE statement has been introduced in the DESTROY 
statement. 

To permit partial deletion, records (entities) that contain set-function 
expressions will be initialized with the null value. This will permit exclusion of 
individual items from sets of values. 


The general format of the EXCLUDE statement is: 
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EXCLUDE expression FROM set valued function expression 


Expression and set valued function expression are as previously described in the 
INCLUDE statement. Examples with comments are provided in Figure 6.9. The 
general mapping algorithm is shown in Figure C.6 of Appendix C. 


Example F.1 


FOR EACH s IN student WHERE name(s)="Thomas Jefferson" 
LOOP 
EXCLUDE {c IN course WHERE title(c) = "History 1"} 
FROM enrollments(s); 
END LOOP; 


The following is the execution algorithm: 


RETRIEVE (<File = person> and 
<name = "Thomas Jefferson">) 
(person.key) 
for each (person.key) 
RETRIEVE (<File = student> and 
<student.key<person.key = *>>) 
(enrollment.key) 
for each (enrollment.key) 
RETRIEVE (<File = course> and <title = "History 1">) 
(course.key) 
DELETE (<File = enrollment> and <enrollment.key = **> 
and <class<course.key = ***>>) 
DELETE (<File = student> and 
<student.key<person.key = *>> 
and <enrollments<enrollment.key = **>>) 


Example F.2 


FOR EACH f IN faculty WHERE name(f) = "Cow" 
KOOP 
EXCLUDE {c IN course WHERE title(c) = "DBMS" 
AND dnaine(dept(c)) = "EECS"} 
FROM teaching(f); 
END LOOP: 
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The following 1s the execution algorithm: 


RETRIEVE (<File = person> and <name "Cow">) 
(person.key) 
RETRIEVE (<File = department> and <dname = "EECS">) 
(department.key) 
RETRIEVE (<File = course> and <title = "DBMS"> and 
<dept<department.key = **>>) (course.key) 
for each (person.key) 

DELETE (<File = faculty> and 
<faculty.key<person.key = *>> and 
<teaching<course.key = ***>>) 


Example F.3 


FOR EACH e IN employee WHERE name(e) = "Bug" 
LOOP 

EXCLUDE "4848151" 

FROM phones(e); 
END LOOP; 


The following is the execution algorithm 


RETRIEVE (<File = person> and <name = "Bug">) 
(person. key) 
for each (person.key) 
DELETE (<File = employee> and 
<employee.key<person.key = *>> and 
<phones = "4848151">) 


Example F.4 


FOR EACH e IN employee WHERE name(e) = "Bug" 
EOOR 
EXCLUDE phones(s) 
FROM phones(s); 
END LOOP; 


The following is the execution algorithm; 


RETRIEVE (<File = person> and <name = "Bug">) 
(person. key) 
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for each (person.key) 
DELETE (<File = employee> and 
<employee.key<person.key = *>> and 
<phones != 0>) 


Figure 6.9 Examples of EXCLUDE Statement 


G. THE MOVE STATEMENT 

The MOVE statement moves a record (i.e., an entity) or a group of records 
with the same value conditions in a subtype from one set of subtype files into 
another set of subtype files. All corresponding descendant subtypes of the record 
are also automatically moved. The subtype file into which the record is to be 
added must be a terminal type. When the record is added to the terminal type. 
it is automatically added to the ancestors of this terminal type of which part of 
the record does not already belong. 


The general format of the MOVE statement is: 


MOVE entity valued expression 
[FROM entity type names] 
[INTO entity type names] 

[(function name 1 => expression 1, 
function name 2 => expression 2 


= 
function name m => expression m)|; 


where entity valued expression is : 
- an entity valued loop parameter, 
-a single entity valued function expression. 
- or an entity valued set containing one member. 


where FROM entity type names are: 
- a list of one or more previously declared 


entity subtypes of the same base type. 


where INTO entity type names are: 
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- a list of one or more overlapping terminal 
subtypes of the same base type. 


constraints: all four are parsing aborts: 

- An entity cannot be removed from or added 
into a base type. 

- If removing a subtype and then referencing a 
function corresponding to that subtype, 
the move is aborted. 

- If MOVE invalidates null value constraint, 
uniqueness constraint, or an overlap 
constraint, abort MOVE. 

- If MOVE conflicts with requirement that an 
entity of a given supertype must belong to at 
least one of its subtypes, abort move. 


As in the previous sections a mapping algorithm is presented in Figure C.7 of 
Appendix C. First, the algorithm checks for all possible errors. If there are no 
errors, the algorithm proceeds to retrieve record subtypes until it has found the 
record or records with the necessary conditions for removal. It then removes 
these records and associated records from all descendants of this record subtype. 
Finally, it locates the terminal subtype to which the record should be added. It 
adds it to that terminal subtype and to all ancestors of that terminal subtype to 
which the record does not belong. 

The corresponding ABDL statements needed to transform the Daplex move 
statement are: RETRIEVE, DELETE, and INSERT. The RETRIEVE finds the 
record, the DELETE removes it from its subtype and subtype descendants, and 
the INSERT creates the record in the terminal type and the appropriate terminal 
type ancestors. 

Examples are provided with comments as before in Figure 6.10 to clarify the 


Move statement. 
Example G.1 


MOVE {g IN graduate WHERE ssn(g) = "556667777"} 
FROM student INTO faculty 
(home address => "789 Cambridge St, Boston", 
office ee 
salary => 25000.00, jet PLONE 


dependents => 0, i 
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rank => assistant, 
dept => {d IN department WHERE /* FACULTY 
dname(d) = "EECS"});  ******#*/ 


The following is the execution algorithm: 


RETRIEVE (File = graduate) 
(graduate.key(student.key(person.key)) } 
until 
RETRIEVE((File = person) and (person.key = *) 
and (ssn = "556667777"))(person.key) 
/* graduate is student is person */ 
DELETE ((File = graduate) and 
(graduate.key(student.key(person.key = **)))) 
DELETE ((File = student) and 
(student.key(person.key = **))) 
/* faculty is employee is person */ 
INSERT (<File, employee>, 
<employee.key,<person.key, **>>, 
<home address, "789 Cambridge St, Boston">, 
<office, "218">, <phones,0>, 
<salary, 25000.00>, <dependents, 0>) 
RETRIEVE ((File = department) and (dname = "EECS")) 
(department.key) 
INSERT (<File, faculty>, 
<faculty.key,<employee.key, <person.key, **>>>, 
<rank, <rank name, assistant>>, 
<teaching,<course.key,0>>, 
<tenure, false>, <dept,<department.key, ***>>) 


Example G.2 


FOR EACH u IN undergraduate 
WHERE year(u) = 4 AND gpa(u) > 2.5 
LOOP 
MOVE u FROM undergraduate INTO graduate; 
ENP OOP: 


The following is the execution algorithm: 


RETRIEVE ((File = undergraduate) and (year = 4) and 


(gpa > 2.5)) 
(undergraduate.key (student .key(person.key))) 


ie undergraduate is student di 
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/* graduate is student */ 
for each (undergraduate(student.key(person.key))) 
DELETE ((File = undergraduate) and 

(undergraduate. key(student.key(person.key = *)) 

INSERT (<File, graduate>, 

<graduate.key,<student.key,<person.key, *>>>, 

<advisory committee, <employee.key, 

<person.key, 0>>>) 

/* null = 0 */ 


Example G.3 


FOR EACH u IN undergraduate 
WHERE year(u) = 4 AND gpa(u) > 2.5 
LOOP 
MOVE u FROM student INTO graduate 
(major => {d IN department WHERE 
dname(d) = "FRENCH"}: 
END LOOP; 


The following is the execution algorithm: 


RETRIEVE ((File = undergraduate) and (year = 4) and 
(gpa > 2.5)) 
(undergraduate.key (student .key(person.key )) ) 


/* to simulate FROM */ 
for each (undergraduate.key (student. key(person.key))) 


DELETE ((File=student) and (student.key(person.key=**)) 
DELETE ((File=undergraduate) and 
(undergraduate.key(student.key (person.key=**)))) 


/* to simulate INTO */ 
RETRIEVE ((File=department) and (dname="FRENCH")) 
(department.key) 

for each (graduate.key(student.key(person.key))) 

INSERT (<File,student>,<student.key,<person.key,**>>, 
<advisor.faculty.key, 
<employee.key.<person.key,0>>>, 
<major,<department.key,***>>, 
<enrollments,<enrollment.key .0>> 

INSERT (<File.graduate>. 
<graduate.key,<student.key,<person.key,** >>>. 
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<advisory committee, 
<faculty.key,<employee.key,<person.key,0>>>>) 


Example G.4 


MOVE {u IN undergraduate WHERE ssn(u) = "556667777" 
INTO faculty 
(home_ address => "456 Inman St, Cambridge", 


3) 


illegal move statement; will abort because leaving 
undergraduate record violates overlap 
constraint between subtypes. 


Example G.5 


MOVE {g IN graduate WHERE ssn(g) = "556667777"} 
FROM graduate INTO faculty 
(home_ address => "890 Charles St, Boston", 


ey 


illegal move statement; case of dangling subtype - 
entity of given supertype must belong to one 
of its subtypes, i.e., student must be 
undergraduate or graduate. 


Example G.6 


MOVE {u IN undergraduate WHERE ssn(u) = "555667777"} 
FROM undergraduate INTO faculty, graduate 
(home address => "890 Charles St., Boston", 
office =e il) 


salary => 5500.0, 

dependents => 0, 

rank «=> assistant, 

dept => {d IN department WHERE 


name(d) = "EECS"}); 
The following is the execution algorithm: 


RETRIEVE (File = undergraduate) 
(undergraduate.key (student. key (person.key))) 
until 
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RETRIEVE ((File = person) and (person.key = *) and 
(ssn = "555667777"))(person.key) 
/* simulate FROM */ 
DELETE ((File = undergraduate) and 
(undergraduate.key(student.key(person.key = *)))) 
/* still have student supertype */ 


/* faculty is employee is person */ 

INSERT (<File, employee> ,<employee.key,<person.key, **>>, 
<home_ address, "890 Charles St, Boston">, 
<office, "271">, <phones,0>, <salary, 5500.00>, 
<dependents, 0>) 

RETRIEVE ((File = department) and (dname = "EECS")) 

(department.key) 

INSERT (<File, faculty>, 
<faculty.key,<employee.key, <person.key, **>>>, 
<rank, <rank name, assistant>>, 
<teaching,<course.key O>>, 
<tenure, false>, <dept,<department.key, ***>>) 


INSERT (<File, graduate>, 
<graduate.key,<student.key.<person.key, *>>>, 
<advisory committee, 
<faculty.key,<employee.key,<person.key, 0> >>>) 

Figure 6.10 Examples of the MOVE Statement 
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VIL. IMPLEMENTATION SPECIFICATIONS 


This chapter attempts to "bridge the gap" between the design specifications 
and the implementation specifications for transforming and controlling the 
execution of Daplex data structures into their equivalent ABDL data structures 
and translating and controlling the execution of Daplex requests into their 
equivalent ABDL requests. In Chapter 2 we have briefly presented the MLDS 
modules required to transfer and translate Daplex data definitions and 
transactions into equivalent ABDL data definitions and transactions. In Chapter 
4 we have provided a rigorous specification for a Daplex database and structured 
the Daplex data in a format recognizable by LIL of MLDS. In this chapter, we 
first describe a proposed implementation specification for the transformation and 
translation of Daplex data definition schema and requests into equivalent ABIDL 
data definition schema and requests using KMS of MLDS. We then describe a 
proposed implementation specification for the contro of the ABDL data 
definitions and requests in MBDS using KC of MLDS. It should be noted that 
the proposals for KMS and KC in this chapter are not meant to be rigorous 
implementation specifications. Rather they are meant to pass our understanding 
of the operations and functions of the Daplex database system onward for 


implementation. 


A. PARSING AND TRANSLATING BY KMS 

This section proposes specifications for an implementation of the 
transformation of Chapter 5 and the translation of Chapter 6. The proposal 
parallels a method used in all previous implementations [Ref. 8, 9. 10]. The 
interested reader is referred to these References for details of the {LDS modules. 
As mentioned in Chapters 2, 5, and 6, KMS of MLDS is the module responsible 


for transforming and translating Daplex requests into their equivalent ABDL 


requests. The KMS functions include: 
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(1) parsing the request to validate the user’s Daplex syntax, and 
(2) translating the request into an equivalent ABDL request. 


To propose an implementation specification for KMS we utilized the 
Adaplex grammar [Ref. 22] in the Backus-Naur Form (BNF) of CCA. The 
Adaplex BNF grammar is then reduced to a pure Daplex BNF grammar by 
eliminating all system-dependent and Ada-related goals and rules. A grammar is 
a set of goals and rules governing the syntax of a language. The Daplex BNF 
grammar accomplishes the first function of KMS. The second function of KMS is 
proposed by the interleaving pseudocode in the Daplex BNF grammar in the 
recursive-descent format. Pseudocode is short, concise, English language phrases 
that are structured by key words such as If-Then-Else, While-Do, and End. Key 
words and indentation describe the flow of control, while the English phrases 
describe processing actions [Ref. 1]. The recursive-descent format implies that the 
pseudocode is recursively defined in the descending order in which the rules are 
executed in the grammar in order to satisfy the goals of the grammar. 

The Daplex BNF grammar is further subdivided into its DDL and DML 
sections to match the requirements for transforming and translating Daplex 
requests into their equivalent ABDL requests. The informal pseudocode is well 
over 2000 lines. Because of its length it was decided not to include it in this 
thesis. It is available upon request. Since its purpose is to provide a specification 
for actual implementation, which is 5 or 6 times longer than the pseudocode, it is 
recommended that the reader review the actual implementation of Daplex to 


ABDL |[Ref. 11] for the final implementation decisions. 


B. CONTROLLING BY KC 

This section provides an overview of the control for executing ABDL 
requests by MBDS once KMS has performed the necessary transformations and 
translations of Daplex requests into their equivalent ABDL requests. KC 
controls the submission of the ABDL requests to MBDS for processing. 

For ABDL requests that involve inserting information to create a new 
database, or the use of intermediate retrieval requests to insert. delete, or update 


information in an existing database, control is returned to LIL after MBDS 
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processes the transaction. Informally, intermediate retrieval requests are those 
requests which retrieve data values from data files which are used in other data 
files to retrieve data values that are needed to satisfy a transaction request. For 
ABDL requests involving final retrieval requests, KC sends the ABDL request to 
MBDS, receives the results back from MBDS, loads the results into a buffer. and 
calls KFS to format the results one buffer at a time. Informally, a final retrieval 
request is the last request needed to retrieve a data value to satisfy a transaction 
request. After the last buffer is processed by KFS the resulting table is displayed 
and control returns to LIL. 

Examples of intermediate requests are provided in Section C of Chapter 6. 
Intermediate retrieval requests are sent to KC by KM5 as request templates. <A 
request template is an ABDL retrieve request with one unspecified attribute 
value. KC must use the results obtained from the previous ABDL retrieve 
request (i.e., target attribute values) and the request template to build the next 
ABDL request, i.e.. KC substitutes the target attribute values for the unspecified 
attribute value in the request template. The processing of intermediate retrieval 
requests is managed by KC. 

Just as data structures are specified for LIL in Chapter 4, data structures 
are specified for KC to recognize the different types of ABDL requests. A KC 
procedure then uses instructions for each separate ABDL request (i.e.. INSERT, 
DELETE, UPDATE, the intermediate RETRIEVE, and the final RETRIEVE) 
to control the processing of the ABDL requests. References are again made to all 
the implementation theses for details of the required data structures and control 
procedures. <A particular emphasis is directed to Reference 9 where the 
implementation of KC and the ABDL example are identical to that required for 


Daplex. 
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VII. RESULTS AND CONCLUSIONS 


As stated in Chapter 1, by using an unconventional approach to the design 
and implementation of a basic database system, we can have a system supporting 
multiple data models as if the system is a heterogeneous collection of database 
systems. Our unconventional approach is geared to flexibility, efficiency, and 
extensibility, which makes it an attractive alternative to conventional 
approaches. By developing multiple data language interfaces we offer users our 
approach without incurring any retraining costs. In adopting our system, users 
appear to have their same old database system, but one that works faster and has 
other data languages. | 

In this theses we have presented a methodology for supporting entity- 
relationship database management on an attribute-based database system. 
Specifically, we have provided the design and detailed data structures required to 
recognize Daplex requests by MLDS in Chapter 4. We have described the 
transformations required from Daplex data definitions in the Daplex schema into 
ABDL definitions in Chapter 5; and provided a proposed implementation 
specification for the DDL transformations in Chapter 7. We have also described 
the translations required from Daplex transactions into their corresponding 
ABDL translations in Chapter 6; and provided a proposed implementation 
specification for the DML translation in Chapter 7. Finally, we provided an 
overview of the contro! procedures required for processing the equivalent ABDL 
requests in Chapter 7. 

The entity-relationship interface can be implemented on the basis of the work 
we have presented herein and of the work we have accomplished to date [Ref. 8, 
9, 10]. The implementation of the entity-relationship interface using MLDS will 
be the high point in the study of the Multi-Backend and Multi-lingual Database 
System. 
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APPENDIX A: THE UNIVERSITY DATABASE SCHEMA 





DATABASE university IS 


TYPE person; 

SUBTYPE employee; 

SUBTYPE support staff; 

SUBLYPE faculty: 

SUBTYPE student: 

SUBYPE graduate; 

SUBTYPE undergraduate; 

TYPE course; 

TYPE department; 

TYPE enrollment; 

TYPE rank name IS (assistant, associate, full); 
TYPE semester name IS (fall, spring, summer); 
TYPE grade point IS FLOAT RANGE 0.0 .. 4.0; 


TYPE person IS 
ENTITY 
name : STRING (1 .. 25); 
ssn : STRING (1 .. 9) := "000000000"; 
EDEN aE Y; 


SUBTYPE employee IS person 


ENR Y 
home address : STRING (1 .. 50); 
office : STRUNG ale Si 
phones 2 SE OPS RING (lo. 4): 
salary : FLOAT; 


dependents : INTEGER RANGE 0O.. 10; 
END ENTITY; 


SUBTYPE support staff IS employee 
ENTITY 
supervisor : employee WITHNULL; 
full time : BOOLEAN; 
END ENTITY; 


SUBTYPE faeulty IS employee 
ENTITY 
rank : rank name; 
teaching ; S&T OF course; 


tenure : BOOLEAN := FALSE; 
dept : department; 
END ENDIG. 


SUBTYPE student IS person 


ENTITY 
advisor : faculty WITHNULL; 
major : department; 


enrollments : SET OF enrollment; 


END ENG 


SUBTYPE graduate IS student 
ENTITY 
advisory committee ; SET OF faculty; 
END ENGit y 


SUBTYPE undergraduate IS student 
BN 
gpa : grade point := 0.0; 
year : INTEGER RANGE 1 .. 4 := 1; 
BEND ENTITY; 


TYPE course IS 


lela ys 
title :< SPRING (io. 
dept : department; 


semester : semester name; 
credits : INTEGER; 
ENP DEEN Pity. 


TYPE department IS 
EN a. 
name 25 RING (1e20ie 
head : faculty WITHNULL; 
END ENTITY; 


TYPE enrollment IS 
EN Prey 
class : course; 
grade : grade point; 
END EN 


UNIQUE ssn WITHIN person; 


UNIQUE name WITHIN department; 
UNIQUE title, semester WITHIN course; 


74 


OVERLAP graduate WITH faculty; 
END university; 


Figure A.1 University Database Schema 
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Figure A.2 Logical Graphical Representation of University Database Schema 
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Figure A.3 Generalization Hierarchy of Person for the University Database Schema 
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APPENDIX B: DAPLEX DATA STRUCTURES 


struct function node 
/* structure definition for each function type declaration */ 


char fn name{ENLength+1]; 


char _ fn_type; /* either f(loat), i(nteger), s(tring), 
b(oolean), or e(numeration) */ 

int fn_range; /* Boolean if range of values */ 

int fn_total_ length; /* max length */ 

int fn_ num value; /* number of actual values */ 


struct ent _value *fn_ value; /* actual value */ 

struct ent “node *fn _entptr; /* ptr to entity type */ 

struct gens sub _node ‘fn _subptr: /* ptr to entity subtype */ 

struct ent non “node *fn_nonentptr: /* ptr to nonentity type */ 
struct sub 1 _non node *fn_nonsubptr: /* ptr to nonentity subtype */ 
struct der non “node *fn_nonderptr: /* ptr to nonentity dertype */ 
int fn_entnull; /* initialized false set true for no value */ 
int fn_unique; /* init false - unique if true */ 

struct function node *fn_ next fntptr; 


je 


struct overlap sub node /* list of ponters */ 
/* structure definition for terminal subtypes that define one or more 
subtypes */ 


struct gen sub node *“osn_name: /* only terminal subtypes */ 
struct overlap sub node ‘*osn next name; 


ie 


struct overlap ent node  /* list of pointers */ 
/* structure definition for subtypes with one or more entity supertypes */ 


{ 


struct ent node *oen name; 
struct overlap ent node *oen next name; 


ie 


struct gen sub node 
/* structure def for each generalization (supertype/subtype) node */ 


char  gsn name{ENLength + 1); 


int gsn num funct: /* number of assoc. functions* / 

int gsn_ terminal;  /* if true (=1) it is terminal type */ 

struct overlap ent node *gsn entptr; /* ptr to entity supertype */ 
int gsn num _ent; /* number of entity supertypes */ 

struct function node *gsn_ ftnptr: 


struct overlap sub node *gsn subptr; /* ptr to subtype supertype */ 
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int gsn num sub; /* number of subtype supertypes ”/ 
struct gen sub node “gsn_ next genptr; 


HE 


struct ent node 
/* structure definition for each entity node */ 


{ 


char en name|ENLength + 1); 


int en num funct; /* number of assoc. functions * / 
int en_terminal, /* if true (=1) it is terminal type */ 
struct function node *en ftnptr; 

struct ent node *en next ent; 


ie 


struct ent value 
/* struct def for value of ’i’,’s’,f,’e’, or *b’ */ 
char ‘*ev_ value; /* pointer to character string only */ 
struct ent value *ev_next value; 


ye 


struct ent_non_ node 
/* structure def for each base-type nonentity node */ 


{ 


char enn name{ENLength + 1); 


char enn type; /* either i(nteger), s(tring), 
f(loat), e(numeration), or b(oolean) */ 
int enn total length; /* max length base-type value */ 
/* of one string in ent value */ 
int enn range; /* true or false depending on 


whether there is a range. If a 
range exists, there must be two 
entries into ent value */ 


int enn num _ values; /* number of actual values */ 
struct ent value *enn value; /* actual value of base-type */ 
int enn_constant; /* boolean to refelect constant value */ 


struct ent non node ‘*enn next node; 


}; 


struct sub non node 
/* structure def for each subtype nonentity node */ 


{ 


char snn_ name({ENLength + 1]; 
char snn_ type; /* either i(nteger). s(tring). 
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f(loat), e(numeration), or b(oolean) */ 

int snn_ total length; /* max length of subtype value */ 
int snn_range; /* true or false depending on 

whether there is a range. If a 

range exists, there must be two 

entries into ent value */ 
int snn num values; /* number of actual values */ 
struct ent value *snn_ value; /* actual value of subtype */ 
struct sub non node “*snn_next_node; 


ie 


struct der non node 
/* structure def for each derived t: pe nonentity node */ 


{ 


char dnn name{ENLength + 1); 


char dnn_type; /* either i(nteger), s(tring), 

f(loat), e(numeration), or b(oolean) */ 
int dnn_ total length; /* max length of derived type value */ 
int dnn_range; /* true or false depending on 


whether there is a range. If a 

range exists, there must be two 

entries into ent value */ 
int dnn num values; /* number of actual values */ 
struct ent value *“dnn value; /* actual value of derived type */ 
struct der non node ‘*dnn_ next node; 


Ip 


struct ent_dbid_ node 
/* structure def for each entity-relationship dbid node */ 


{ 


char edn name{[DBNLength + 1]; 
struct ent non node ‘*edn_nonentity; 


int edn num _nonent; /* number of nonentity types */ 
struct ent node *edn_ entity; 

int edn num ent; /* number of entity types */ 

struct gen sub node ‘edn subptr; 

int edn num gen; /* number of gen subtypes */ 

struct sub non node ‘edn nonsubptr; 

int edn num_nonsub; /* number of nonentity subtypes */ 
struct der non node ‘edn nonderptr; 

int edn num der; /* number of nonentity derived types */ 


struct ent dbid node *edn next db; 


le 
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APPENDIX C: DAPLEX TRANSLATION ALGORITHMS 


If specified file not terminal entity 
abort CREATE routine. 


While file not equal to specified CREATE file 
Get next supertype | terminal file 


If expression = function lookup IN 
supertype | terminal type 
RETRIEVE ((File = supertype|terminal type) and 
(function = lookup value)) (file.key) 
| ((File = supertype|terminal type) and 
(function = lookup value)) or 
If entity type 
INSERT(<File, new entity _type>. 
_ <new entity _type.key,**>, 
<db function name,lookup value>, 
<database function names, 
values| pointers to values>) 
end if 
end if 
INSERT(<File,supertype|terminal type>, 
<supertype.key|terminal type.key, **), 
<database function names. 
values| pointers to values>) 


END While 





Figure C.1 Generalized Mapping Algorithm For CREATE 


Take given filename and search entity types and 
subtypes in entire database 

for function expressions for filename.key 
RETRIEVE follows from Ist 21 lines from assignment 
if not one of entity types or subtypes 
then 
DELETE actual values or pointer values 


Figure C.2 Generalized mapping algorithm For DESTROY 
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/* Generalized format of for each loop */ 


[RETRIEVE (File = Given filename) (function.key) 
| RETRIEVE (File = Given filename) 
(requested function.key)] 
lfor each (function.key) V (requested function.key) 
{RETRIEVE((File = function) and 
((function.key|requested function.key) =**) 
| and ((function op ’value’) | (function op ’value’) 
and)) (desired function) END 
| RETRIEVE((File = function) and 
((function.key| requested _function.key)=**) 
| and ((function op ’value’) | (function op ’value’) 
and)) (desired function) BY 
(function) END} 
| {RETRIEVE((File = function) and 
((function.key| requested function.key)=**) 
| and ((function op ’value’) | (function op ’value’) 
and)) (desired function) 
for each (function.key) V (requested function.key) 
RETRIEVE((File = function) and 
((function.key| requested function.key)=**) 
| and ((function op ’value’) | (function op ’value’) 
and)) (desired function) 
BY (function) 
for each (function.key) V (requested function.key)| 


Figure C.3 Generalized Mapping Algorithm For FOR EACH Loop 


82 


/* Generalized format of assignment statement o 


if (lookup function IN given filename) 
then begin 
RETRIEVE ((File = given filename) and 
(lookup function = value)) 
(given filename function.key) 


end begin 

else begin 

while (given filename != terminal subtype) and 
(given function != function name | 
descendant function name) 

do 


RETRIEVE (File = descendant file) 
(descendant function name(s)) 


end_ while 
while ((given filename = terminal subtype) and 
(lookup function != function name | 


ancestor function name)) 
or ((descendant f file = terminal subtype) and 
(lookup f function != function name | 
ancestor function name)) 
do 
RETRIEVE (File = ancestor _file) 
(ancestor function name(s)) 
end_ while 
RETRIEVE ((File = descendant file | ancestor file) 
and (lookup function = value)) 
(root function.key) 
end else_begin 
if loop function in ancestor file to given file 
for each root function.key lg given filename function.key 
[RETRIEVE ((File = Given filename) and — 
(given filename function.key = root function.key | 
given filename function. key)) 
(given filename function.key) 
end if 
for each (root function.key | 
given filename function.key) 
repeat 
/* find function */ 
{ for each indirect function expression 
if ((expression filename) and 
(expression function = 
expression filename function name)) 
then begin 
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RETRIEVE ((File = expression filename) and 
(expression function = value)) 
(expression filename function.key ) 

end_ begin 


else begin 
while (expression filename != terminal subtype) and 
(expression function != function_name | 


descendant function name) 
do 
RETRIEVE (File = descendant_file) 


(descendant function_name(s)) 


end while 
while ((expression filename = terminal subtype) and 
(expression function != function name | 


ancestor function name)) 
or ((descendant file = terminal subtype) and 
(expression function != function name | 
ancestor function name)) 
do 
RETRIEVE (File = ancestor file) 
(ancestor function _name(s)) 
end_ while 
RETRIEVE ((File = descendant file | 
ancestor file) and 
(expression function = value)) 
(expression filename function.key) 
end else begin 
if expression function in ancestor file to 
expression file 
for each expression filename function.key 
[RETRIEVE ((File = expression filename) and 
(expression filename function.key = *)) 
(expression filename function.key | 
laggregate oper|(actual value)) 
end if 


if ((given filename) and 
(given loop function = 
given filename function name)) 
then 
loop function file = given_filename 
end if 
while (given filename != terminal subtype) and 
(given loop function != function name | 
descendant function name) 


do 
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RETRIEVE (File = descendant file) 
(descendant function name(s)) 
loop_function file = descendant filename 
end_ while 
while ((given_filename = terminal subtype) and 
(given loop function != function name | 
ancestor function _name)) 
or 
((descendant file = terminal subtype) and 
(given loop function != function name | 
ancestor function name)) 
do 
RETRIEVE (File = ancestor _file) 
(ancestor function _name(s)) 
loop function file = ancestor filename 
end _ while 
UPDATE ((File = loop function file) and 
(loop file.key = root function.key | 
given filename function.key) and 
(given loop function = actual value | 
indirect value)) 
/* where indirect value = 
expression filename function.key 
| [aggregate operation|(actual value) */ 
/*and loop function file = ancestor _file | 
descendant _file| given filename*/ 
if (given loop function != filename function name) 
then error; 
if (given loop function and value are disjoint) 
then error; 
until no more loop functions 


Figure C.4 Generalized Mapping Algorithm For ASSIGNMENT 
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/* null value into existing set = error */ 
/* duplicates just don’t get added */ 


Copy lst 21 lines from assign statement 


if function expression != actual value 
then do lookup of function expression value 
lookup: 


from INCLUDE 
RETRIEVE files on specified functions (may be 
more than one file to retrieve depending 
upon number and type of function values 
to key on from WHERE) 
from INTO 
if target function does not receive an actual 
value or target function does not receive its 
pointer value directly from a file in the 
INCLUDE, then before the pointer value can be 
INSERT into the target function, an INSERT 
will be required in the file pointed to by the 
target function to provide the required pointer 
value indirectly (see Example 1). 
/* for check and unchanged values */ 
RETRIEVE given filename 
/* new variable names in code */ 
INSERT actual values or pointer values 


Figure C.5 Generalized Mapping Algorithm For INCLUDE 
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RETRIEVE (File = descendant file) 
(descendant function name(s)) 
loop function file = descendant filename 
end _ while 
while ((given_filename = terminal subtype) and 
(given loop function != function _name | 
ancestor function name)) 
or 
((descendant file = terminal subtype) and 
(given loop function != function name | 
ancestor function name)) 
do 
RETRIEVE (File = ancestor file) 
(ancestor function name(s)) 
loop function file = ancestor filename 
end_while 
UPDATE ((File = loop function file) and 
(loop file.key = root_function.key | 
given filename function.key) and 
(given loop function = actual value | 
indirect value)) 
/* where indirect value = 
expression filename _function.key 
| [aggregate operation|(actual value) */ 
/*and loop function file = ancestor file | 
descendant file | given_filename*/ 
if (given loop function != filename function name) 
then error: 
if (given loop function and value are disjoint) 
then error; 
until no more loop functions 


Figure C.4 Generalized Mapping Algorithm For ASSIGNMENT 
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/* null value into existing set = error */ 
/* duplicates just don’t get added */ 


Copy Ist 21 lines from assign statement 


if function expression != actual value 
then do lookup of function expression value 
lookup: 


from INCLUDE 
RETRIEVE files on specified functions (may be 
more than one file to retrieve depending 
upon number and type of function values 
to key on from WHERE) 
from INTO 
if target function does not receive an actual 
value or target function does not receive its 
pointer value directly from a file in the 
INCLUDE, then before the pointer value can be 
INSERT into the target function, an INSERT 
will be required in the file pointed to by the 
target function to provide the required pointer 
value indirectly (see Example 1). 
/* for check and unchanged values */ 
RETRIEVE given filename 
/* new variable names in code */ 
INSERT actual values or pointer values 


Figure C.5 Generalized Mapping Algorithm For INCLUDE 
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if function expression != actual value 
then do lookup of function expression value 
lookup: 
from EXCLUDE 
RETRIEVE files on specified functions (may be 
more than one file to retrieve depending 
upon number and type of function values 
to key on from WHERE) 
from FROM 
if target function does not receive an actual 
value or target function does not receive its 
pointer value directly from a file in the 
EXCLUDE, then before the pointer value can be 
DELETE into the target function, a DELETE will 
be required in the file pointed to by the 
target function to eliminate the indirectly 
referenced pointer value (see Example 1). 
DELETE actual values or pointer values 
/* new variable names in code */ 


Figure C.6 Generalized Mapping Algorithm For EXCLUDE 
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If ((entity valued expression = entity base type) 
or (FROM entity_type name = entity base type) or 
(INTO entity type name = entity base type)) 
then abort MOVE 
If (function name = 
function name of FROM entity type name) 
then abort MOVE 
If (entity valued expression would invalidate 
constraint in INTO entity type name) and 
(FROM entity type name != 
entity valued expression) 
then abort MOVE 
If ((entity valued expression = 
FROM entity type name) 
and (supertype of entity valued expression 
does not point to valid terminal subtype)) 
then abort MOVE 


If (File in entity valued expression) and 
(given search conditions are functions 
in entity valued expressions) 
then 
RETRIEVE ((File = file in entity valued expression) 
and 
(search database function names = 
given values in entity valued expressions) ) 
(file.key) 
If (File in entity valued expression) and 
(given search conditions are not functions in 
entity valued expressions) 


then 
RETRIEVE (File = file in entity valued expression) 
(file.key) 
until 


RETRIEVE ((File=Corres file of search function names) 
and (corres.key = file.key) and 
(search database function names = 
given values in entity valued expressions) ) 

(corres.key) 
i LOOP PARAMETER ay 
If (entity valued expression = 
entity valued loop parameter) 

then 

for each (file.key | corres.key) 

end if 
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While (FROM entity type name(s)) and 
not terminal subtype(s)) 
do 
DELETE ((File = (entity type name(s) | 
terminal subtype name(s)) and 
((entity type name| terminal subtype).key = 
file.key | corres.key)) 
end while 


While (INTO entity type name(s)) and 
not (terminal subtype(s)) 
do 
If (function name expression points to entity with 
pointer value) 
RETRIEVE ((File = (entity type(s) | 
terminal(s)) db function name) and 
(db function name = 
given value in db function expression)) 
((entity_type | terminal) db function name.key) 
end if 
INSERT (<File, (entity type name(s) | 
terminal subtype name(s)>, 
<(entity type | terminal type).key, **)>, 
<db function name, 
((user | system) default value | 
(completely specified user value) 
| pointer to db function name(s)> 


Figure C.7 Generalized Mapping Algorithm for MOVE 
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